THE PROBLEM AND ITS BACKGROUND


A. INTRODUCTION 
1. Historical Background 


DURING World War II, psychologists 
put forth an immense amount of 
effort to improve techniques of selection 
and placement of personnel in the mili- 
tary services and in essential civilian 
activities. Among the most outstanding 
services were those of a group of psy- 
chologists who participated in the re- 
search activities of the United States 
Army Air Forces Aviation Psychology 
Program. With almost unlimited num- 
bers of subjects, they were able to under- 
take many experimental studies and to 
evaluate the effectiveness of many tools 
of research which had been largely de- 
veloped in the non-emergency atmos- 
phere of college and university campuses. 
Statistical tools which had been of pri- 
marily academic interest were brought 
to bear upon the solution of numerous 
practical problems. 

One such statistical device is factor 
analysis. Probably more than any other 
tool, factor analysis has served to ex- 
pand the horizon of psychological test- 
ing in the last five years. In fact, many 
previous ideas concerning the require- 
ments of a good test have been seriously 
challenged, and new standards of test 
evaluation have had to be proposed (10). 

At the time of cessation of hostilities, 
considerable research remained to be 
completed upon projects already in 
progress, to say nothing of that research 
which had been reserved for future proj- 
ects to be authorized. It is from such 
uncompleted research that the data of 
this study originate. In addition to the 
factor analyses to be presented in this 
investigation, numerous other research 
problems for which data are available re- 
main that may throw additional light 
upon the persistent problems encoun- 
tered in psychological and educational 
measurement. 


2. Prerequisite Background for 
Understanding the Statement 
of the Problem 


Since there are several approaches to 
factor analysis, it should be mentioned 
that the Thurstone system with its ap- 
propriate terminology will be employed 
throughout this investigation (5). This 
system received major emphasis during 
World War II in the research units of 
the Aviation Psychology Program of the 
Army Air Forces. Its value in the con- 
struction of tests for selection and place- 
ment of Air Force personnel has been 
demonstrated again and again. The fol- 
lowing advantages have justified the use 
of Thurstone’s centroid approach in the 
studies of human abilities: 


(1) The smallest number of psychologically 
meaningful and relatively independent factors 
necessary to describe a table of (test) intercor- 
relations may be ascertained—factors which are 
much easier to interpret than those derived by 
several other systems. 

(2) The weights of each test with respect to 
the identified factors, or the amount of primary 
ability represented by a test or a criterion, is 
easily determined. 

(3) Regression equations may be set up to 
predict, from the tests, the amount of an indi- 
vidual’s ability in each of the independent (or 
nearly independent) factors. 

(4) Correlations between tests in terms of the 
sum of cross-products of paired factor loadings 
are easily obtained, and original intercorrelations 
of tests are easily reproduced. Moreover, cor- 
relations between tests and a criterion are 
effected which heretofore have been possible 
only through more cumbersome techniques. 

(5) The presence of correlation between fac- 
tors is not excluded and may be readily demon- 
strated through use of oblique (non-orthogonal) 
axes.¹
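
As an illustration of advantage (4), the following minimal sketch reproduces test intercorrelations as sums of cross-products of paired factor loadings. It assumes orthogonal (uncorrelated) factors; the test names and loadings are hypothetical and are not taken from the batteries of this study.

```python
# Hypothetical orthogonal factor loadings (one list of loadings per test).
# These numbers are illustrative only and do not come from the study.
LOADINGS = {
    "verbal": [0.7, 0.1],
    "spatial": [0.2, 0.6],
    "mechanical": [0.4, 0.5],
}


def reproduced_correlation(loadings_a, loadings_b):
    """Sum of cross-products of paired factor loadings (orthogonal factors)."""
    return sum(a * b for a, b in zip(loadings_a, loadings_b))


def communality(loadings):
    """Common-factor variance of a test: the sum of its squared loadings."""
    return sum(k * k for k in loadings)


def reproduced_matrix(loadings_by_test):
    """Reproduced correlations for every pair of tests.

    Off-diagonal cells are reproduced correlations; each diagonal cell
    equals the communality of the corresponding test.
    """
    names = list(loadings_by_test)
    return {
        (a, b): reproduced_correlation(loadings_by_test[a], loadings_by_test[b])
        for a in names
        for b in names
    }


if __name__ == "__main__":
    for (a, b), r in reproduced_matrix(LOADINGS).items():
        print(f"{a:>10} x {b:<10} {r:.2f}")
```

With correlated (oblique) factors the simple cross-product sum no longer suffices, since the correlations among the factors themselves must enter the computation; the sketch does not cover that case.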


B. THE PROBLEM 
1. Statement of the Problem 


The purpose of the investigation was 
to ascertain the contributions of factors 
both to the description of tests and to 
the predictive values of these tests in 
two pilot populations of the United 
States Army Air Forces. Specifically, for 
two groups (815 West Point Cadets and 
356 Negro cadets of significantly differ- 
ent standings in their mean composite 
scores and in their mean scores in tests 
common to the two batteries) the prob- 
lem may be clarified in terms of the 
following two major questions and sub- 
questions: 

(1) What differences were present in 
the factorial composition of two test 
batteries and of a pass-fail criterion? 

(a) How many factors or abilities 
might be found within each of the two 
matrices of intercorrelation? What was 
the smallest number of factors by which 
each matrix might be described? 

(b) What were the factors? In other 
words, how might the factors be in- 
terpreted? Were they meaningful? Were 
the factors the same for the two matrices? 

(c) What were the weights of the 
factors or abilities in each test in its 
respective matrix? 

(d) For both matrices, how complex 
factorially was each test? In other words, 
how many factors were involved in each 
test? How nearly pure was each test? 

(e) Were the factors derived from each 
matrix independent or uncorrelated with 
one another? 


¹ For the reader who is interested in additional 
background material concerning the research 
activities of psychologists in the Army Air 
Forces, three publications (listed in the Bibli- 
ography as 21, 22, and 23, especially 23) are helpful. 

(f) Were the factors derived from the 
two matrices and their corresponding 
weights essentially the same as those 
found in previous analyses (in which a 
larger number of tests have been used) 
for a white cadet-pilot population?² 

(g) Were the factors and their weights 
for those tests which were common to 
the two matrices noticeably different? 

(h) What new factors, not uncovered 
by previous analyses, appeared in either 
matrix? 

(i) For the two groups, what differ- 
ences appeared in the pass-fail criterion 
as to the number and the identification 
of factors, and as to their respective 
loadings? 

(2) What differences were present be- 
tween the two groups in the prediction 
of the pass-fail pilot criterion from the 
scores of the tests in the respective 
batteries? 

(a) What was the validity of each test 
battery in terms of the coefficient of 
multiple correlation between the pass- 
fail criterion and the test scores of the 
battery? 

(b) Approximately how much did each 
test of a battery contribute to the coef- 
ficient of multiple determination? For 
those tests which were identical in the 
two matrices, were there marked differ- 
ences in the variance-contributions to 
the total predicted variance? 

(c) How closely might the validity of 
each test battery be estimated from the 
total common-factor variance of the 
pass-fail criterion? In other words, how 
closely did the sum of the common-factor 
variance in the pass-fail criterion ap- 
proximate the coefficient of multiple de- 
termination corresponding to each 
battery? 


² The results of several previous factor analyses 
have been combined in a report (24). In this 
report are the “weighted averages” of the factor 
loadings, communalities, reliabilities, and validi- 
ties of tests from fifteen batteries upon which 
analyses had been performed. The document 
henceforth will be called Composite Factor 
Analysis Summary. 

(d) What were the estimated validities 
(correlation coefficients) of each test with 
the pass-fail criterion in terms of the 
sum of the cross products of the paired 
factor loadings in the test and criterion? 

(e) Did any tests used in either group 
contribute a marked degree of unique 
variance which materially influenced the 
validity of the battery? 
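
Questions 2(a) and 2(b) turn on the coefficient of multiple correlation and on each test's share of the coefficient of multiple determination. The sketch below works the two-test case in closed form from three correlations. The correlations are hypothetical, and crediting each test with the product of its beta weight and its validity is only one conventional way of apportioning the predicted variance; it is assumed here for illustration and is not asserted to be the procedure of this investigation.

```python
import math


def beta_weights(r1c, r2c, r12):
    """Standard partial regression weights for two tests predicting a criterion.

    r1c, r2c: correlations of tests 1 and 2 with the criterion.
    r12: correlation between the two tests.
    """
    denom = 1.0 - r12 ** 2
    return (r1c - r2c * r12) / denom, (r2c - r1c * r12) / denom


def multiple_determination(r1c, r2c, r12):
    """Return (R squared, per-test contributions) for a two-test battery.

    Each contribution is beta_i * r_ic; the contributions sum to R squared.
    """
    b1, b2 = beta_weights(r1c, r2c, r12)
    contributions = (b1 * r1c, b2 * r2c)
    return sum(contributions), contributions


if __name__ == "__main__":
    # Hypothetical validities and intercorrelation (not data from the study).
    r_squared, parts = multiple_determination(r1c=0.40, r2c=0.30, r12=0.20)
    print(f"R squared = {r_squared:.4f}, R = {math.sqrt(r_squared):.4f}")
    print(f"contribution of test 1 = {parts[0]:.4f}, test 2 = {parts[1]:.4f}")
```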


3. Importance of the Problem 


If, with two such diverse populations³ 
as those represented in this investigation, 
the same factors with their respective 
weights approximately equal should ap- 
pear in the analyses to be highly valid 
in the prediction of a pilot pass-fail 
criterion, then an inference could be 
made that the same tests or test batteries 
might be used in the selection of pilots 
despite marked differences in the charac- 
teristics of populations. Such a finding 
may lead one reasonably to believe that 
a relatively small number of empirically 
determined factors in certain amounts 
is essential to the success of pilot trainees 
of heterogeneous racial, intellectual, 
educational, and socio-economic back- 
grounds. Moreover, if the results of the 
factor analysis of the batteries adminis- 
tered to the two groups should approxi- 
mate those of several previous analyses 
(reported in (24)), then the inference sug- 
gested could be made on empirical 
grounds with a still higher degree of 
confidence. In short, a plausible infer- 
ence could then arise that individuals of 
diverse backgrounds would have to 
possess certain important psychological 
factors in sufficient amounts to be suc- 
cessful pilots. Such a conclusion, if valid, 
would permit the use of the same tests 
in predicting a criterion for numerous 
populations and would minimize con- 
siderably the work of the psychologist. 


³ In Chapter II the populations will be defined 
in operational terms with respect to their stand- 
ings on a scale resembling that of standard 
scores (the pilot stanine). 

On the other hand, if the empirically 
determined factors in the two groups 
should differ considerably with respect 
to their types, their number, and their 
loadings on both the tests and the 
criteria, then obviously two test batteries 
would be necessary. Although such an 
outcome would limit the range of ap- 
plicability of a test battery, the psycholo- 
gist interested in achieving maximum 
validity for the batteries corresponding 
to the two populations would be in a 
position through his knowledge of the 
factorial composition of each criterion 
and of the tests to effect many improve- 
ments. 

Such possibilities for improvement, of 
course, would be revealed whether the 
factor analysis demonstrated similarities 
or differences in the factorial composi- 
tion of the two batteries. When a cri- 
terion is included with the tests of a 
battery, several important advantages 
accrue in the factor approach. With a 
clear-cut and quantitative picture of the 
factorial structure of both the tests and 
the criteria, the test technician has 
achieved a degree of statistical control 
over a test battery which enables him to 
take steps to maximize its validity 
through retention of certain tests—espe- 
cially pure tests containing relevant 
variance—revision of others, elimination 
of some containing large amounts of 
irrelevant variance, and addition of tests 
contributing new variance held in com- 
mon by the criterion. Hence, some im- 
portant outgrowths of this study may be 
several practical suggestions for further 
experimentation with the tests so as to 
increase their validity. 


C. DEFINITIONS OF TERMS 
1. Factor 


Nearly synonymous with the interpretation of 
ability and trait, the meaning of a factor may 
be considered a convenient theoretical construct 
or intervening variable which is functionally re- 
lated both to antecedent S-variables and to con- 
sequent R-variables.⁴ The S-variables refer to the 
test-items which sample, in a more or less con- 
trolled manner, the predictive abilities of the 
examinee. The S-variables, frequently referred 
to as “independent” variables, are subject to a 
considerable degree of manipulation or modifica- 
tion (as in two forms of a test designed to 
measure the same function, or in two tests pur- 
porting to measure different functions). The R- 
variables are the “dependent” variables. Sche- 
matically, the relationship involved may be pic- 
tured as S → I → R. Upon the assumption that 
a workable scale of measurement with an 
arbitrary zero and approximately equal units 
can be effected, scores for individuals may be 
determined and classified. The application of 
factor analysis to correlations derived from these 
measures serves merely to simplify the relation- 
ships among the quantified response (dependent) 
variables in terms of fewer, somewhat more 
stable, unit patterns (factors) which are func- 
tionally related to the stimulus (independent) 
variables. Any interpretation, however, of what 
these relatively independent unit patterns signify 
is a theoretical construct, intervening variable, 
or inference, capable of being related to both 
the antecedent and the consequent conditions 
described. In short, factors are considered to be 
functional unities of reasonable stability, de- 
scribing in a simplified manner the fundamental 
psychological operations of the individual as he 
reacts to a variety of different tests, problems, 
or situations subject to a greater or lesser degree 
of experimental control. 

⁴ The operational definition of factor follows 
closely the technique employed by Hull and 
Spence in their definitions of other constructs 
such as drive. For a detailed consideration of 
the role of intervening variables in psychological 
theory, see (4) and (14). 


Considerable controversy persists as to the 
value of introspective reports of subjective ex- 
periences in the interpretation of a theoretical 
construct (12). Reports based upon introspection 
of several examinees might supply information 
concerning what influence the test exerts upon 
their actions. Since the factor analyst himself 
resorts to considerable introspection in his at- 
tempt to give a meaningful interpretation of a 
factor, the combination of reports of both the 
technician and examinee might yield a somewhat 
more universally acceptable understanding of 
the factor. However, as most operationists will 
maintain, an intervening variable to be useful 
need not be defined in terms other than those 
which relate it functionally to the operations 
involved in the specified situation. Labelling a 
construct verbally does, however, allow a more 
convenient mode of reference to it. 


2. Validity 


In terms of the factorial composition 
of a test and criterion, a relatively new 
approach to a validity coefficient is 
possible. Just as the correlation between 
two tests may be expressed as the sum 
of the cross-products of the loadings of 
paired factors, so may the correlation 
between a test and criterion be esti- 
mated. Let $k_{i1}, k_{i2}, \ldots, k_{in}$ 
represent the factor loadings of a test, 
and $k_{c1}, k_{c2}, \ldots, k_{cn}$ stand 
for the factor loadings of a criterion. 
The correlation between the test and 
criterion is determined as follows: 


$$r_{ic} = k_{i1}k_{c1} + k_{i2}k_{c2} + \cdots + k_{in}k_{cn}$$


Since the reliability of a test (or criterion) 
is that variance which equals the sum of the 
common factor variance and specific variance of 
a test (or criterion), there is a tendency for the 
correlation between a test and criterion to be 
underestimated if the specific variance in either 
or both is high (upon the additional assump- 
tion that the error variances are uncorrelated 
with each other and with the non-error vari- 
ance). If additional tests can be brought into a 
battery to aid in the identification of a new 
factor common to a given test and criterion 
(with a resulting drop in specific variances), 
then the validity coefficient can be considerably 
raised (upon the assumption that the loadings 
upon the other factors in the test and criterion 
do not decrease and that the loadings on the 
new factor in the test and criterion are of like 
sign). With a sufficient number of tests in a 
battery for identification of factors, the validity 
coefficients obtained closely approximate those 
ascertained from the use of traditional methods 
(23, pp. 797-859). 
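
The argument of the preceding paragraph may be made concrete with a small numerical sketch. All loadings and the reliability figure below are hypothetical. The sketch shows a validity estimated from two identified factors, and the rise in that estimate once a third factor, held in common by test and criterion with loadings of like sign, has been identified and moved out of the test's specific variance.

```python
def estimated_validity(test_loadings, criterion_loadings):
    """Validity estimated as the sum of cross-products of paired loadings."""
    return sum(t * c for t, c in zip(test_loadings, criterion_loadings))


def specific_variance(reliability, loadings):
    """Reliable variance not accounted for by the identified common factors.

    Follows the partition stated in the text:
    reliability = common-factor variance + specific variance.
    """
    return reliability - sum(k * k for k in loadings)


# Hypothetical values, chosen only for illustration.
TEST_RELIABILITY = 0.80
TEST_BEFORE = [0.5, 0.3]            # loadings on two identified factors
CRITERION_BEFORE = [0.4, 0.2]
TEST_AFTER = TEST_BEFORE + [0.4]    # a third factor is identified
CRITERION_AFTER = CRITERION_BEFORE + [0.3]

if __name__ == "__main__":
    before = estimated_validity(TEST_BEFORE, CRITERION_BEFORE)
    after = estimated_validity(TEST_AFTER, CRITERION_AFTER)
    print(f"estimated validity, two factors:   {before:.2f}")
    print(f"estimated validity, three factors: {after:.2f}")
    print(f"specific variance before: {specific_variance(TEST_RELIABILITY, TEST_BEFORE):.2f}")
    print(f"specific variance after:  {specific_variance(TEST_RELIABILITY, TEST_AFTER):.2f}")
```

The loadings on the first two factors are left unchanged in the second case, in keeping with the assumption stated in the text that they do not decrease.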


3. Stanine 


A term coined and frequently used in the AAF 
psychological research units during the war was 
stanine. A distribution of a large number of 
representative composite scores from a given test 
battery may be scaled in terms of standard scores 
or according to a T-scale or C-scale technique. 
The stanine is an adaptation of Guilford’s eleven 
point C-scale system (2). A stanine is a Guilford 
C-scale in which the 0 and 1 scaled scores and 
the 9 and 10 scaled scores are lumped together. 
Hence the stanine is a C-scale of nine scaled 
score intervals instead of the customary eleven. 
Therefore the percentages of scores (to the 
closest whole numbers) are 4, 7, 12, 17, 20, 17, 
12, 7 and 4. With respect to a standard repre- 
sentative group, an examinee’s score may be 
easily converted from a centile score or standard 
score to a stanine. 
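
The conversion from a centile score to a stanine can be sketched from the percentages given above. The function name is hypothetical, and the assignment of a score falling exactly on a boundary to the higher stanine is an assumption, since the text does not state how boundary cases were handled. The sketch also derives the mean and standard deviation implied by the nine percentages.

```python
import math
from bisect import bisect_right

# Percentages of scores in stanines 1 through 9, as given in the text.
STANINE_PERCENTAGES = [4, 7, 12, 17, 20, 17, 12, 7, 4]


def upper_bounds(percentages):
    """Cumulative centile at the top of stanines 1 through 8."""
    bounds, running = [], 0
    for pct in percentages[:-1]:
        running += pct
        bounds.append(running)
    return bounds  # [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_centile(centile):
    """Convert a centile (0 to 100) to a stanine (1 to 9).

    Assumption: a centile exactly on a boundary goes to the higher stanine.
    """
    if not 0 <= centile <= 100:
        raise ValueError("centile must lie between 0 and 100")
    return bisect_right(upper_bounds(STANINE_PERCENTAGES), centile) + 1


def stanine_mean_and_sd(percentages=STANINE_PERCENTAGES):
    """Mean and standard deviation implied by the stanine percentages."""
    total = sum(percentages)
    mean = sum(p * s for s, p in enumerate(percentages, start=1)) / total
    variance = sum(p * (s - mean) ** 2
                   for s, p in enumerate(percentages, start=1)) / total
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    print(stanine_from_centile(50))
    print(stanine_mean_and_sd())
```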

For each new classification battery, stanines 
were established for the scores of bombardiers, 
navigators, pilots, and in some instances, for the 
scores of other groups. The norms for these 
stanines were based upon the scores of several 
representative groups of cadets taking the classi- 
fication battery at various training centers 
throughout the United States. The scores of all 
groups were converted to scale values determined 
from these standardizing groups. Cut-off points 
were established for elimination of cadets, fre- 
quently as a result of determination of the pro- 
portion of trainees at each stanine level elimi- 
nated. The basis of elimination was the pass-fail 
criterion, the next term to be defined. 


4. Pass-fail Criterion 


Whether a cadet should pass or fail frequently 
rested upon highly subjective grounds. The 
criterion was subject to considerable variation 
from one training unit to another. A given cadet 
was passed or failed on the basis of a rather 
unstable “weighting” of grades earned in subjects 
studied in ground school and of ratings reported 
by flight instructors and qualified judges. The 
decisions of boards of military personnel were 
based upon those grades and ratings as well as 
upon their impressions of the cadet gained 
through personal interview and general military 
record. In many instances periodic physical ex- 
amination served to eliminate a small proportion 
of cadets from training. Trainees expressing 
dissatisfaction or fear were eliminated. 
Despite the inherent weaknesses of such a 
criterion, it was used as a basis for validation of 
tests. Biserial coefficients of correlation were 
computed between the dichotomous pass-fail cri- 
terion (assumed to be a continuous variable 
normally distributed) and the more-or-less sym- 
metrically distributed scores on a given test. 
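
A biserial coefficient of the kind described may be sketched as follows. The textbook formula is used: the difference between the mean scores of passing and failing cadets, divided by the standard deviation of all scores, multiplied by pq/y, where p and q are the proportions passing and failing and y is the ordinate of the unit normal curve at the point of dichotomy. The scores are hypothetical, and the use of the population form of the standard deviation is an assumption.

```python
from statistics import NormalDist, mean, pstdev


def biserial_correlation(passed_scores, failed_scores):
    """Biserial r between a pass-fail dichotomy and a continuous test score.

    Assumes the dichotomy reflects an underlying normally distributed
    variable. Both groups must be non-empty.
    """
    all_scores = list(passed_scores) + list(failed_scores)
    p = len(passed_scores) / len(all_scores)   # proportion passing
    q = 1.0 - p                                # proportion failing
    unit_normal = NormalDist()
    z_cut = unit_normal.inv_cdf(q)             # point of dichotomy
    ordinate = unit_normal.pdf(z_cut)          # height of curve at that point
    mean_difference = mean(passed_scores) - mean(failed_scores)
    return mean_difference / pstdev(all_scores) * (p * q / ordinate)


if __name__ == "__main__":
    # Hypothetical test scores for six passing and six failing cadets.
    passed = [4, 6, 7, 8, 5, 6]
    failed = [3, 5, 4, 6, 4, 5]
    print(f"biserial r = {biserial_correlation(passed, failed):.3f}")
```

Unlike a product-moment coefficient, a biserial coefficient computed in this way is not bounded by 1.00 when the score distribution departs markedly from normality, which is one reason the normality assumption noted in the text matters.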


In addition to these four important 
terms, others might also be defined. A 
more functional approach appears to be 
that of elucidation of concepts or terms 
when necessary at appropriate places 
throughout the text. However, the four 
terms defined do represent a necessary 
and workable nucleus. 


D. REVIEW OF RELATED LITERATURE 


Relatively few studies have been made con- 
cerning the effect of two or more populations 
upon the factorial composition of a test battery. 
A few studies concerning the effect of groups 
differing with respect to age (7, 8, 13) or sex 
(19) upon factorial composition of tests have 
been performed. However, these studies are not 
particularly applicable to the problem of this 
investigation, since the two male populations 
of Negroes and West Point Cadets are repre- 
sentative of nearly the same age range. 

In three recent articles of a highly theoretical 
nature, Thomson (15), Thomson and Leder- 
mann (16), and Thurstone (18) have investi- 
gated the influence of univariate selection and 
multivariate selection in the factor analyses of 
tests of abilities. Thurstone has used the for- 
mulas developed by Thomson, as well as his 
definitions of terms, and has expanded somewhat 
upon the work of these British investigators. 
The fundamental problem proposed by these 
three psychologists is to determine the alteration 
in the factor pattern of test scores of an initial 
group when a second comparable group (the 
experimental group) is selected on the basis of 
their homogeneity of performance on one test 
(or several tests) of the battery—called a selection 
variable or a criterion test. 

Thurstone is able to show that the inclusion 
of one or more additional tests which are linear 
combinations of those tests initially present in 
the battery results in the addition of one or 
more factors that are incidental. If simple struc- 
ture is apparent for the initial test battery, the 
addition of new tests does not alter the structure 
except in the instance of a large number of 
incidental factors, the variance of which may 
mask that of the common factors. 

However, Thurstone’s findings are at best only 
indirectly applicable to the two populations of 
West Point and of Negro pilot cadets, in that 
no selection test, such as the AAF Qualifying 
Examination, was included in the test batteries. 
(The cadets were selected on the basis of 
their total, or composite, scores on the classifi- 
cation battery. There was, of course, some degree 
of selection in pilot training as in any other 
occupation.) Nor can the pass-fail criterion be 
employed, since it was not used for selection of 
examinees, but followed by several months the 
administration of the tests. Moreover, to explore 
the problem as set up by Thurstone would 
require for best results two control populations 
in addition to the two experimental populations. 

In short, no investigations reported in the 
literature seem to bear directly upon the prob- 
lem of the study. At appropriate places in the 
following chapters, however, references will be 
made to pertinent investigations which may 
furnish needed clarifications of the text. For 
general background to the problem the reader 
is referred to the previously mentioned publi- 
cations of the AAF Aviation Psychology Program 
(21, 22, 23). 
IN THE following three sections an at- 
tempt will be made: (1) to define in 
operational terms the two previously 
mentioned populations of West Point 
Cadets and Negroes selected for investi- 
gation of the problem of the dissertation; 
(2) to present a rationale for the inclu- 
sion of tests in the two batteries (the 
intercorrelations of which will be sub- 
jected subsequently to factor analysis); 
and (3) to describe the tests with respect 
to their purpose, content, scoring formu- 
lae, and time-limits. 


A. AN OPERATIONAL DEFINITION OF THE 
TWO POPULATIONS OF NEGROES AND 
WEST POINT CADETS 


In a previous paragraph (I, B, 3), 
reference was made to the fact that the 
two groups of West Point Cadets and 
Negroes were diverse. Although from a 
qualitative standpoint differences in the 
two groups with respect to such charac- 
teristics as race, amount of previous edu- 
cation, socio-economic status, and so 
forth, may be described, an operational 
approach in terms of the difference in 
their standings in the pilot stanine and 
in individual tests common to the two 
batteries is both meaningful and precise. 

One serious difficulty arose in the 
comparison of the two groups with re- 
spect to their means and standard devia- 
tions on the pilot stanine, in that the 
composite scores of each group were 
based on two different test batteries. Two 
reasons, however, justified the computa- 
tion of a t-ratio as a test of the signifi- 
cance of a difference between means 
(actually a test of the hypothesis that the 
two samples were drawn from the same 
population). First, the obtained mean 
stanine of each group stood in reference 
to a mean stanine of 5 (and approximate 
standard deviation of 1.96) which was 
derived for two representative white 
aviation-cadet populations. Each of these 
two populations was a composite of 
samples obtained at various training 
centers throughout the United States. 
Second, the mean stanines of two repre- 
sentative white cadet populations upon 
the November, 1943, and September, 
1944, battery were regarded as relatively 
stable (prior to the elimination of any 
cadets below the stanine cut-off point). 
In other words, a negligible difference 
between the mean stanines of two repre- 
sentative white-cadet populations should 
be expected from one year to the next 
on two comparable test batteries. 
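
The t-ratio for a difference between two independent means may be sketched as below, using the group sizes, means, and standard deviations that appear in Table 1 for the Negro cadets and for the bomber pilot stanine of the West Point Cadets. The form of the standard error (the sum of each variance divided by its group size, without pooling) is an assumption, since the text does not state which formula was employed.

```python
import math


def t_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t-ratio for the difference between two independent means.

    Assumption: the standard error of the difference is computed from the
    two variances separately (no pooling).
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error


def degrees_of_freedom(n_a, n_b):
    """Degrees of freedom as used in the text: total cases minus two."""
    return n_a + n_b - 2


if __name__ == "__main__":
    # Negro cadets versus West Point Cadets (bomber pilot stanine), Table 1.
    t = t_ratio(mean_a=3.95, sd_a=1.61, n_a=356,
                mean_b=7.09, sd_b=1.59, n_b=815)
    print(f"t = {t:.1f} for {degrees_of_freedom(356, 815)} degrees of freedom")
```

Under this assumption the computed ratio agrees, to one decimal place, with the value of 30.8 reported for the bomber pilot comparison.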


1. Difference Between the Negro Popu- 
lation and the West Point Population 
Statistically Defined 


Upon the assumption that a stanine 
of 5 on the November, 1943, battery was 
the same as a stanine value of 5 on the 
September, 1944, battery, t-ratios were 
computed to obtain the degree of signifi- 
cance of the difference between the ob- 
tained stanine means of the two groups. 
Since a distinction between fighter pilot 
and bomber pilot was employed at the 
time of administration of the Septem- 
ber, 1944, battery, two t-ratios were 
computed to test the significance of 
differences (1) between the mean pilot 
stanine of the Negroes and the mean 
fighter pilot stanine of the West Point 
Cadets and (2) between the mean pilot 
stanine of the Negroes and the mean 
bomber pilot stanine of the West Point 
Cadets. The means and standard devia- 
tions for the 815 West Point men and 
356 Negroes are presented in Table 1. 
The t-ratio for the difference between 
the mean of the Negro pilot stanine and 
the mean of the West Point fighter pilot 
stanine was significant beyond the one 
percent level (t = 21.3 for 1169 degrees 
of freedom). For the difference between 
the same mean for the Negroes and the 
bomber pilot stanine mean the degree of 
significance was still higher (t = 30.8 for 
1169 degrees of freedom). Obviously, it 
was reasonable to assume that the two 
populations were not homogeneous. 


TABLE 1 

STANINE MEANS AND STANDARD DEVIATIONS OF NEGRO 
CADETS AND WEST POINT CADETS 

Group                       Number   Mean          Standard Deviation 
I.  Negro Cadets              356    3.95          1.61 
II. West Point Cadets         815 
    A. Fighter Pilots        (815)   [illegible]   [illegible] 
    B. Bomber Pilots         (815)   7.09          1.59 


In eight other comparisons between 
the mean stanine scores of these two 
groups and the new stanine scores of 
eight representative white-cadet samples 
on respective test batteries, statistical 
significance was revealed beyond the one 
percent level. (The data for these com- 
parisons were taken from Tables 3.17 
and 3.20 in (21) and from (25).) 


2. Differences Between Mean Scores in 
Individual Tests Administered to 
Negroes and West Point Cadets 


Since there were only thirteen tests 
common to the November, 1943, and 
September, 1944, batteries, the number 
of comparisons of scores was limited. 
Moreover, since the raw scores of the 
Negroes upon the six psychomotor tests 
(the only scores available) did not corre- 
spond with the standard scores of the 
West Point group, the number of tests 
for which comparisons of means might 
be effected was limited to seven (all 
pencil-and-paper tests). All seven differ- 
ences between means for the two groups 
were statistically significant beyond the 
one percent level. 

B. RATIONALE UNDERLYING SELECTION OF 
TESTS FOR NOVEMBER, 1943, AND 
SEPTEMBER, 1944, CLASSIFICATION 
BATTERIES 


The choice of tests to be used in 
classification batteries such as those of 
November, 1943, and September, 1944, 
depended largely upon the traditional 
multiple-regression-equation principles. 
As a result of this statistical work, many 
tests were retained; some, modified; and 
a few new ones, added. Since a test 
battery had to serve a threefold purpose 
of selecting pilots, navigators, and 
bombardiers, several compromises had to 
be made in choice of tests for the 
batteries. Some tests, for example, con- 
tributed substantially to the validity of 
the battery for one of the three groups 
mentioned, but negligibly for the two 
others. 

During the course of the war, the 
rationale underlying selection of tests 
rested more and more upon the factorial 
variance which these tests held in com- 
mon with that of the criterion. Wherever 
possible, pure tests (containing variance 
in the relevant factor) were preferred to 
complex tests (containing substantial 
amounts of variance in two or more 
relevant factors). Several exceptions to 
this goal of purity were apparent—espe- 
cially in the use of psychomotor tests. 
Complex tests were retained when their 
absence would result in a marked re- 
duction in the overall validity of the 
battery. As the factorial composition of 
the pass-fail criteria became better 
known, tests were constructed to cover 
the identified areas of the criteria. The 
addition of new tests frequently revealed 
new factors, and subsequently other tests 
were designed to describe better these 
factors and to augment the validity of 
the battery. Consequently, it became 
convenient to think of test areas and 
of families of tests covering these areas 
of variance. 

Although as many as twenty-seven 
factors were revealed (at least tenta- 
tively), the major portion of the validity 
variance (for pilot, navigator, and 
bombardier groups) had been delimited 
to six or eight factors on the basis of 
previous analyses. For convenience, tests 
might be grouped into the following 
areas: mathematical, verbal, reasoning, 
judgment, general and mechanical in- 
formation, biographical data, space and 
visualization, perceptual speed, and ap- 
paratus.⁵ 


⁵ For further information concerning the 
rationale for inclusion of tests in the two 
batteries, see (23, pp. 851-886). 


C. DESCRIPTION OF TESTS 


In order that the identification of the 
factors may be more meaningful, a de- 
scription of the tests in the two batteries 
follows: first, for the Classification 
Battery of November, 1943, administered 
to the population of Negroes; second, 
for the Classification Battery of Septem- 
ber, 1944, administered to the popula- 
tion of West Point Cadets. As mentioned 
previously, the November battery con- 
sisted of eighteen tests, sixteen of which 
appeared in the September battery (with 
slight revisions in the composition of 
two and in the directions for one); in 
the September battery, five new tests 
appeared (making for a total of twenty- 
one tests). Hence, the following descrip- 
tion covers systematically each test in the 
November battery and those eight tests 
in the September battery of which five 
were unique additions and three were 
modifications of tests in the November 
battery. 


1. Tests of the Classification Battery of 
November, 1943 


a. Pencil-and-paper tests. The test of Reading 
Comprehension, CI614H, planned to obtain a 
measure of verbal intelligence, consists of five 
paragraphs of technical material upon each of 
which the examinee must select the best answer 
in each of several multiple choice items of five 
alternatives. Paragraphs upon geography, me- 
chanics (physics), the psychology of vision, the 
compass, and map projection theory represent 
the content. The total number of items is 
thirty-six. With a thirty-minute time limit, the 
test is scored: 2R - W/2. 

Two tests of fifty items each of spatial orienta- 
tion were employed. Spatial Orientation Test I, 
CP501B, was designed to test the ability of the 
examinee to identify parts of aerial photographs. 

On the upper half of each page of the test 
booklet is a large rectangular aerial photograph, 
approximately six inches by eight inches, of 
likely military objectives (such as industrial 
cities traversed by rivers, or harbor cities). Fif- 
teen small circles of about one-fourth inch 
diameter are overprinted upon the aerial photo- 
graph in a random fashion. Each of these circles 
is lettered. On the lower half of each page are 
five or six larger circles (numbered) with di- 
ameters of approximately one and one-fourth 
inches, representing the areas to which certain 
ones of the fifteen small circles (lettered) on the 
rectangular photograph correspond. To each of 
the larger numbered circles (representing 
number of test item) the examinee is to choose
a smaller lettered circle (response) which 
matches. As a rule, the position of the territory 
in the larger circle is enlarged or diminished 
considerably and rotated from its corresponding 
position in the rectangular photograph, because 
the small photograph was taken from altitudes 
and angles different from those of the rec- 
tangular photograph. Planned to be a five- 
minute test, it is scored: R—W/5. 

Spatial Orientation Test II, CP503B, was de- 
signed to measure an examinee’s ability to locate 
on a map corresponding places shown on an 
aerial photograph. On the upper half of each 
page the rectangular section of the map is 
approximately four inches by six inches. The 
relief maps (similar to those in Goode’s School 
Atlas) employ various contour shadings of green 
and purple for ranges of elevation, yellow areas 
for large towns, circles for small towns, blue 
lines for rivers and streams, black lines crossed 
by short marks for railroads, purple lines for 
highways, and other miscellaneous symbols. Each 
map is sectioned off into twelve square areas 
labeled with capital letters (A to L). At the 
bottom of each page are four photographs, each 
enlarged ten times (bearing the number of the 
items) corresponding to a restricted portion of 
one of the square areas upon the map (correct 
alternative response). Both the photographs and 
the maps are oriented in the same manner 
geographically—north being the upper edge of 
the respective figures. In summary, the task is 
to find the area in the lettered block on the map 
corresponding to the enlarged photograph and 
to record the answer by blackening on the 
answer sheet the space containing the appropri- 
ate letter. For each double page, consisting of 
two maps and four respective photographs, the 
examinee is allowed three minutes. Since there 
are six double pages, the total testing time is 18 
minutes. The scoring formula is: R—W/5.

The Dial and Table Reading Test, CP622-21A, 
represents a combination of tests which in earlier 
batteries were administered separately. In the 
dial reading test the purpose is to determine 
how quickly and accurately the examinee can 
read the dials on an instrument panel. Seven 
imitation dials are labeled R.P.M., Air Speed, 
Altitude, Voltmeter, Temperature (with Oil and 
Fuel subdials superimposed), Fuel-Air Ratio, and 
Amperes. Ten sets of simultaneous readings are 
presented throughout the test to each of which 
six items, one for each instrument, are used 
(except in the tenth set where only three items 
are used). Each item contains five alternative 
responses—readings of the respective dial—from 
which the examinee selects the correct answer. 
For the 57 items following the practice set of 
three, nine minutes are allowed. 


The table reading test, designed to measure 
one’s ability to read tables quickly and ac-
curately, is divided into two parts. In Part I
a large table is set up with numbers ranging 
from —17 to +17 across the top and up the 
left side. The numbers at the top and sides are 
respectively labeled first value and second value. 
In the same manner as one reads a road map to 
ascertain the distance between two localities, the 
examinee given a first value number and a 
second value number has to go over to the 
column corresponding to the first value number 
and down this column to the space where the 
row corresponding to the second value number 
intersects the column. At the space of inter- 
section is either a two digit or three digit 
number representing the correct answer among 
four other incorrect alternatives in a multiple 
choice item, in which the first value and second 
value numbers stated together are the given 
stimuli. The testing time allowed for 42 items 
is eight minutes. 

In Part II of the table reading test, four tables 
are presented. Across the top of each table is 
a given air speed of 100, 120, 140 and 180 miles 
per hour. For different wind angles written in a 
column at the left-hand side by steps of 10 
degrees from 0 to 90, three wind velocities of
10, 15, and 20 miles per hour head three addi- 
tional columns. Each of these three columns is 
further divided into two sets (sub-columns) of 
direction correction and ground speed (corre- 
sponding to the wind angle at the air speed 
under consideration). Given for each item the 
amount of air speed, of wind velocity, and of 
wind angle, the examinee is asked to determine 
from reading one of the four tables either the
direction correction or the ground speed. There 
are five choices for each item. The total testing 
time is seven minutes. Hence the total time for 
the three sub-tests is 24 minutes. For all items 
upon the three sub-tests the scoring formula is:
R—W/2.

The Biographical Data Test, CE602D, is scored 
twice in each battery. The raw scores upon the
test are differentially weighted according to 
whether it is used in the pilot or navigator 
context. An outgrowth of several previous forms, 
the Biographical Data test was designed to 
measure family and social relationships, and 
background experiences, which have been shown 
valid in predicting success of pilots. The sixty- 
five items with varying numbers of alternatives 
to which the examinee is directed to give the 
most accurate answer possible pertain to the 
following areas: the socio-economic status, racial 
origins, and educational experiences of the 


members of his family; his degree of success in
various school subjects and in athletic participa-
tion; his favorite hobbies and recreational ac-
tivities; previous work experiences; and reasons
for choice of aviation cadet training. Timed at 
twenty-five minutes, the test is scored: R—W + 
20 in which R is a positively weighted response 
and W, a negatively weighted response. 

The test of Mechanical Principles, consisting 
of two comparable forms, CI903A and CI903B,
in the respective batteries, was designed to 
measure the examinee’s knowledge and under- 
standing of mechanical principles. As in the 
Bennett Test of Mechanical Comprehension, 
diagrams of mechanical devices and familiar 
situations in which elementary principles of 
mechanics are involved are part of each item. 
These figures are labeled with letters and 
arrows (usually indicating a force vector or 
movement). Given certain relevant information 
concerning the problem represented by the 
diagram, the examinee responds to one of the 
alternatives, three, four, or five in number. 
Planned to take twenty minutes, the test of 
forty-one items is scored: R—W/2 + 20. 

The General Information Test, CE505E, was 
intended to measure what its title suggests. The 
one hundred items consisting in each instance 
of four or five alternative words, phrases, or 
clauses completing the meaning of the fractional 
part of the given stimulus sentence are dis- 
tributed as follows: Aviation Interest, 35 items; 
Sports and Hobbies, 13 items; Mechanical In- 
formation, 20 items; Driving Information, 12 
items; Flying Information, 20 items. Designed 
to take thirty-six minutes, the test is scored: 
R—W/4. 

Two tests in mathematics employed in the
November, 1943, Classification Battery were de- 
signed to measure mathematical and reasoning 
ability. The first test, Mathematics A, CI702F,
containing thirty-five multiple choice items, em- 
bodies elementary problems primarily in algebra 
with a few geometric and trigonometric concepts 
introduced. Emphasis upon the so-called 
“thought” problems is minimized in Test A. 
With a time limit of twenty-five minutes, this 
test is scored: R—W/4.

The second test, Mathematics B, CI206C, fre-
quently designated as Arithmetic Reasoning,
CI206C, was developed to measure primarily
reasoning ability. The thirty multiple choice 
test items closely resemble the typical “word” or 
“thought” problems with which students in 
general mathematics and algebra courses in high 
schools seem to have difficulty. With a time 
limit of thirty-five minutes, this second test is 
scored: 2R—W/2. 

Similarly, two tests in instrument compre-
hension were used in the November, 1943, Classi-
fication Battery. The first, Instrument Compre-
hension I, CI615B, was designed to evaluate the
examinee’s ability to read and to interpret
instrument dials. The instrument panel used in 
each item consists of a series of dials (altimeter, 
artificial horizon, compass, rate of climb, air 
speed, and turn-bank) shown on one page. Op- 
posite each item are five choices that describe 
the actions of the plane in phrases consisting of 
verbal and numerical terms. With a time limit 
of twelve minutes, the test is scored: 20 — R 
+ W/4. This unusual formula was used because 
during the short time this test was included in 
the classification battery, it had a negative re- 
gression weight. This formula permitted the use 
of a positive weight in machine combination 
procedures. 

The second test, Instrument Comprehension II, 
CI616B, attempts to measure one’s ability to 
interpret the position of a plane from the read- 
ings of two instruments: the compass and arti- 
ficial horizon—instrument flying. As the reader 
knows, the compass shows the direction of the 
plane. The artificial horizon dial indicates 
whether the plane is climbing, flying level, or 
diving; banking left, or banking right. Combi- 
nations of one of the former three motions and 
of one of the latter two motions may be in- 
terpreted from this dial. Five pictures of planes 
in different positions (alternative responses) ac- 
company each new set (item) of artificial hori- 
zon and compass readings. Limited to eighteen 
minutes, this test is scored: R—W/4.
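
The scoring formulas quoted for the printed tests share one pattern: a weighted count of right answers, less a fraction of the wrong answers, sometimes plus a constant. The following is a minimal sketch of that pattern, assuming that R and W are plain counts of right and wrong answers and that a formula such as "R minus W/4" means R less one fourth of W. The function and argument names are illustrative and do not come from the source, and the counts of rights and wrongs are invented.

```python
def formula_score(rights, wrongs, r_weight=1.0, w_divisor=1.0, constant=0.0):
    """Hypothetical helper: r_weight * R - W / w_divisor + constant."""
    return r_weight * rights - wrongs / w_divisor + constant


# Reading Comprehension, 2R - W/2 (invented counts: 30 right, 6 wrong)
reading = formula_score(30, 6, r_weight=2, w_divisor=2)

# Mechanical Principles, R - W/2 + 20 (invented counts: 30 right, 8 wrong)
mechanical = formula_score(30, 8, w_divisor=2, constant=20)

# Instrument Comprehension II, R - W/4 (invented counts: 40 right, 8 wrong)
instrument_ii = formula_score(40, 8, w_divisor=4)

# Instrument Comprehension I, 20 - R + W/4, expressed with negative weights
instrument_i = formula_score(10, 8, r_weight=-1, w_divisor=-4, constant=20)
```

The reversed formula for Instrument Comprehension I fits the same helper once both weights change sign, which mirrors the remark in the text that the reversal was only a device to permit a positive weight in machine combination.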


b. Apparatus Tests. Among the six psycho- 
motor tests employed, the most familiar to the 
psychologist is probably the Rotary Pursuit, 
CP410B. Designed to measure the degree of 
skill acquired in eye-hand coordination in 
which sweeping arm and hand movements are 
required, this test has been given under two 
conditions: without divided attention and with 
divided attention. In both the November and 
September batteries, the test with divided at- 
tention was used. In the simple rotary pursuit 
task (without divided attention) the examinee 
attempts to maintain his stylus on a small tar- 
get on the disk, revolving at 60 revolutions per 
minute. Hence the arm moves in a free and 
gross circular manner. In the rotary pursuit task 
with divided attention, the distracting device 
consists of a second task. Inside a box are two 
lamps, connected to adjacent keys. During trials 
6 to 15 of the simple rotary pursuit task, the 
lamps alternate in lighting in an irregular (but 
controlled) manner. In order that his pursuit 
score may be recorded, the examinee has to 
depress with his free hand the key correspond- 
ing to the lighted lamp. A total of fifteen trials, 
twenty seconds each, serves as the basis for the 
score: the total time of contact with the target 
in hundredths of a minute. 
No penalties were used in scoring the six
psychomotor tests. Simple conversion tables 
were employed in transforming all scores on 
psychomotor tests into standard scores. This 
eliminated test machine differences. 

The Complex Coordination Test, CM701A,
measures, as its name suggests, the degree of 
skill and speed with which an examinee can 
perform a series of complex reactions. A chair 
and stick are provided as well as a panel in 
front of the examinee containing three sets of 
parallel rows of potentially red and green lights. 
In any given trial, three red bulbs are illumi- 
nated. A lighted bulb on the top arc shows the 
lateral position of the stick required; one in 
the middle vertical row (perpendicular to the 
top arc and to the bottom horizontal row) sig- 
nifies the required forward-backward placement 
of the stick; and the one along the bottom 
horizontal row denotes the required position of 
the rudder control. When the examinee moves 
the stick and rudder control, green bulbs il- 
luminate in rows parallel to those in which the 
red lights appear. The assigned task is to match 
the three green lights to their corresponding 
red lights as quickly as possible. Whenever one 
problem is solved, a new combination of three 
red lights is shown. Following a two-minute 
practice period, the performance score is the 
number of sets matched in eight minutes. 

The Finger Dexterity Test, CM116A, was de-
signed to measure the degree of skill acquired
in eye-finger coordination that requires rapid 
and precise movements of the fingers and wrist. 
In this test the examinee fits pegs into a peg 
board, turning the pegs to the right in the 
process. Square-shaped pegs about two inches 
long are painted blue on one vertical side and 
yellow on the opposite vertical side. At the start, 
either all blue or all yellow sides are toward 
the examinee. He is directed to lift each peg 
from its hole, to turn it clockwise 180° and 
to re-set it in the hole. Following one practice 
trial, five test trials of thirty-five seconds each 
are administered. The score is the total number 
of pegs turned by the right hand. 

The Discrimination Reaction Time Test, 
CP611D, the origin of which obviously may be 
traced to the psychological laboratory, was 
planned to measure the speed of reaction de- 
termined by the amount of time required for the 
examinee to press one of four switches in re-
sponse to four possible patterns of green and red
lights on a panel in front of him. Two lights are 
green; two, red. The upper left and lower right 
bulbs are red. For each trial one red lamp and 
one green lamp are lighted in a randomized 
arrangement. In terms of the position of the 
red light with respect to the green light (four 
possible combinations), the examinee responds 
by throwing one of four switches. For example,
if red is at the left of green, he throws the 
left-hand switch. When the correct response is 
made, an illuminated white bulb located above 
the four others, goes out. Following ten prac- 
tice trials, eighty test trials of four series of 
twenty settings were given. Recorded automati- 
cally by electric clocks, the total of the reaction 
times is the score. 

The Two-Hand Coordination Test, CM101A, 
was designed to measure one’s ability to coor- 
dinate the movements of both hands in the per- 
formance of a complex motor task. At the slow 
rate of one revolution per minute a target on a 
disk moves in a circle and at the same time runs 
slowly back and forth in a slot upon the disk. 
In an attempt to keep the pointer upon the 
irregularly moving target, the examinee can 
move the pointer forward or backward by turn- 
ing with his right hand a crank; right or left, 
by turning another crank with his left hand. 
The coordinated simultaneous movement of 
both hands is required for a good score. For 
eight trials of one minute each the total amount 
of time that the examinee keeps the pointer 
on the target represents the score. The contact 
time, expressed in hundredths of a minute, is 
automatically recorded by an electric clock. 

The Fernald Rudder Control Test, CM120B, 
was designed to measure the examinee’s ability 
to effect appropriate motor responses to changes 
in position indicated by visual and kinesthetic 
cues. The apparatus consists of a seat and rud-
der bar mounted on a boom which is placed
on a swivel. With the center of gravity in front 
of the swivel, a state of disequilibrium is evi-
denced in the constant tendency of the appara-
tus to swing to one side. When force is appro-
priately exerted on the rudder bar, a resulting
change occurs in the tension of the control 
springs. With equilibrium temporarily restored, 
the device is returned to a straight-ahead posi- 
tion. Other parts of the apparatus include a 
control stick (not functionally related to the 
uprighting of the device) and a cowling along 
which the examinee aims at a target. Following 
a one-minute practice trial, five test trials of 
one minute each are scored in the unit of num- 
ber counts (recorded on a counter) in .25 
seconds.


2. Tests of the Classification Battery of 
September, 1944 


In both the November, 1943, Classification 
Battery and the September, 1944, Classification 
Battery the same six apparatus tests were used. 
Among the printed tests the following were 
identical in the two batteries: Reading Com-
prehension, CI614H; Spatial Orientation I,
CP501B; Spatial Orientation II, CP503B; Dial
and Table Reading, CP622-21A; Biographical 
Data, CE602D (both Navigator and Pilot 
Weightings), and Mathematics B (Arithmetic 
Reasoning), CI206C. 

Three tests in the more recent battery were 
comparable to those of the earlier battery, since 
only slight revisions were made. Five new pencil 
and paper tests appeared in the second battery: 
Numerical Operations Front, CI702B; Numerical
Operations Back, CI702B; Practical Judgment,
CI301C; Mechanical Information, CI905B; and
Speed of Identification, CP610A. The test Mathe-
matics A, CI702F, of the earlier battery was
replaced by two tests upon numerical opera- 
tions. 

For the three tests of the earlier battery only 
minor changes were effected. Instead of Mechani-
cal Principles, CI903A, and General Information,
CE505E, the revised tests Mechanical Principles,
CI903B, and General Information, CE505F,
were used in the September, 1944, Classification 
Battery administered to the West Point cadets. 
The differences between the two forms of the 
Mechanical Principles test are negligible. In the 
Composite Factor Analysis Summary, only in- 
consequential differences in the factor composi- 
tion of these two tests are present. Aside from 
a few changes in certain items these two forms 
for all intents and purposes are equivalent. 

In the test of General Information, CE505F,
the number of multiple choice items is 110 in-
stead of 100, as in form CE505E. In Part I of
the revised test, the content of 50 items is con- 
cerned with aviation interest: definitions of 
terms, identification of planes, and miscellaneous 
facts of aviation. As for Part II, sixty items of 
diverse content were used requiring a knowledge 
of contemporary history, literature, geography, 
art, music, sports, and such activities as motoring, 
sailing, and hunting. Supposedly, correct re- 
sponses to items of such content were to be 
indicative of the examinee’s interests and hob- 
bies. For each part the time limit is twenty 
minutes. Although both forms CE505E and
CE505F of the General Information Test are 
limited to forty minutes, the scoring formula 
for the revised form is R instead of R—W/4, of 
the earlier form. 

The last of the three tests in which relatively 
minor changes occurred in the revised forms was 
Instrument Comprehension. In the September 
battery Instrument Comprehension I did not 
appear, and Instrument Comprehension, CI616C,
was a revision of the form CI616B. The differ-
ence in the forms of the two tests lies, not in
content, but in the directions employed. In the
earlier batteries in which CI616A and CI616B
were used, the directions depended upon the 
examinee’s first having read those pertaining to 
Instrument Comprehension I, CI615A and
CI615B, respectively. In the September, 1944,
Classification Battery the directions for Instru-
ment Comprehension, CI616C, were rewritten,
since Instrument Comprehension I was absent. 

Of the five new tests added to the September,
1944, Classification Battery, two were mathe-
matics tests, designed to measure the examinee’s
accuracy and speed in the performance of fun- 
damental numerical operations. In Numerical 
Operations Front and Back, CI702B, the front
consists of simple examples in multiplication 
and addition; the back, of elementary problems 
in division and subtraction. For each item on 
the front an answer is given which the examinee 
indicates to be correct or wrong by blackening 
one of two spaces. On the back five answers are 
presented for each item, one of which is correct. 
The total number of items is 174. In each test 
the amount of time allowed is five minutes (total 
time: 10 minutes). The scoring formula for both 
front and back is: R—3W/2.

Another new test in the September battery 
was Practical Judgment, CI301C, designed to
measure the examinee’s ability to solve practi- 
cal problems. In each of the thirty items, a 
practical problem involving activities in the 
Army is described in two or three sentences with 
conditions specified. Many of the problems 
represent situational predicaments. To each item 
four or five choices of action are presented, the 
best one of which the examinee is to select. 
With a total time of thirty minutes allowed, 
the test is scored: 2R—W/2. 

The fourth new test was Mechanical Informa-
tion, CI905B, purporting to measure the amount
of information the examinee has concerning the 
function and operation of mechanical devices. 
In most of the thirty items the examinee selects 
one of three, four, or five alternative answers 
that describe best the function of a familiar 
mechanical gadget or the probable cause of a 
described mechanical difficulty. With a time limit 
of twelve minutes, the test is scored: R—W/3. 

The last additional test was Speed of Identi- 
fication, CP610A, developed to measure the ac- 
curacy of form perception through use of air- 
plane silhouettes. The items are presented in 
twelve blocks of four each—a total of 48 items. 
Within each block, four planes (items) in the 
form of blue silhouettes are oriented in the 
same position. Although upon initial percep- 
tion they appear quite similar, minute differ- 
ences are apparent in each one. Slight differ- 
ences in shapes of wing, fuselage, gear, and tail, 
for example, appear. Corresponding to each 
item of a given block are five alternative choices 
of planes in silhouette form (also very similar 
in appearance) which are rotated at different 
angles from the given standard position of the
stimulus plane. The examinee is to pick for 
each stimulus plane in the first block one of 
the five (response) planes within a second block 
adjacent to the first block (of stimuli). Within 
each of the twelve stimulus blocks the appear- 
ance of the four stimulus planes is relatively 


homogeneous, but between the different blocks 
of stimulus planes marked heterogeneity is ap- 
parent in that the planes are of diverse models. 
A similar statement may be made concerning 
the twelve blocks of response planes. Clocked at 
only four minutes, the test is scored as R—W. 
IN THE FOLLOWING two major divisions
concerning statistical procedures em-
ployed in the investigation, the order of
the described techniques follows closely 
the sequence of the two major questions 
and of the sub-questions in the statement 
of the problem (I, B, 1). The portion 
upon factor analysis parallels the first 
major question, and the next section 
upon multiple regression equations in 
the prediction of the pass-fail criterion 
scores closely conforms to the second
major question. Emphasis is placed upon 
the statistical operations used rather 
than upon interpretation of results. 


A. THE FACTOR ANALYSIS OF THE TWO
MATRICES OF INTERCORRELATIONS 


As mentioned previously, the Thur- 
stone system was employed exclusively 
in the factor analyses of the two matrices 
of intercorrelations of tests in the classifi-
cation batteries of November, 1943, and
September, 1944, administered respec- 
tively to the two populations of Negroes 
and West Point Cadets. All intercorrela- 
tions were of the product-moment type 
with the exception of the biserial coef- 
ficients of correlation (equivalent to a 
Pearsonian r) between the test scores and 
the pilot pass-fail criterion. The matrices 
of intercorrelations appear in Table 2 
and Table 3. 


1. Limitations of Intercorrelations Used 
in the Two Factor Analyses 


For the population of Negroes, the biserial 
correlations placed in the correlation matrix 
(Table 2) were corrected for restriction of range 
which resulted from the elimination of approxi- 
mately 16.3 percent of the candidates (58 of the 
356 examinees were disqualified for pilot train-
ing). It is not expected that this proportion
of elimination would materially affect the 
factorial composition of the eighteen tests of the
classification battery. However, the loadings of 
the factors in the pass-fail criterion probably 
would be systematically underestimated, since 
the biserial correlation coefficient corrected for 
restriction of range would be larger than one 
not corrected. 

For the population of West Point Cadets, a 
marked lack of normality might be expected in
the distribution of test scores. Fortunately, how- 
ever, the ceilings of the tests were sufficiently 
high that in most instances approximately 
normal distributions of test scores resulted. 
Moreover, for each set of test scores, the amounts 
of variance present for both the West Point 
group and the Negro cadets were approximately 
the same as those of groups of representative 
white aviation-cadets of about the same size. For 
the West Point population, the biserial correla- 
tions of the pass-fail criterion with test scores 
were based upon a number of subjects fewer 
than those given the test battery in that only 
355 out of 815 tested elected pilot training.
However, a marked difference between the two 
populations with respect to the reduced num- 
ber of cases in the pass-fail criterion was ap- 
parent. No systematic elimination of pilot cadets 
on the basis of a low stanine score resulted for 
the West Point group. Hence, no underestima- 
tion of the factor loadings of the pass-fail cri- 
terion should appear, although such loadings 
might be subject to somewhat greater sampling 
errors than those for the twenty-one tests which 
are based on more than twice as many cases.


2. Extraction of Centroid Factors 


For both matrices of intercorrelations the 
technique of extraction followed precisely the 
procedure outlined by Guilford (9). For the
November, 1943, battery (administered to Ne- 
groes) two sets of extractions were required. In 
the first extraction the computed communalities 
of five tests differed more than .10 from the 
estimated communalities used at the beginning 
of the analysis (the discrepancies being +.14, 
+.12, +.12, +.12, and —.12). For the Septem- 
ber, 1944, battery (administered to West Point 
Cadets) only one extraction of centroid factors 
was necessary, since the discrepancies between 
estimated communalities and computed com- 
munalities were all less than .10. 
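
The check that decided whether a second set of extractions was needed can be sketched as follows. A computed communality is taken here as the sum of the squared centroid loadings of a test, and it is compared with the estimate entered at the start of the analysis. The loadings and estimates below are invented for illustration; only the tolerance of .10 comes from the text, and the function name is hypothetical.

```python
def communality_check(loadings, estimated, tolerance=0.10):
    """Compare computed communalities with the starting estimates.

    Returns one (computed, discrepancy, exceeds_tolerance) tuple per test.
    """
    results = []
    for row, estimate in zip(loadings, estimated):
        computed = sum(a * a for a in row)  # h^2 = sum of squared loadings
        discrepancy = computed - estimate
        results.append((computed, discrepancy, abs(discrepancy) > tolerance))
    return results


# Invented loadings of three tests on two centroid factors
loadings = [[0.6, 0.3], [0.5, -0.2], [0.7, 0.4]]
estimated = [0.45, 0.30, 0.50]

results = communality_check(loadings, estimated)
needs_second_extraction = any(flag for _, _, flag in results)
```

In this invented case the third test has a computed communality of about .65 against an estimate of .50, so a second extraction with revised estimates would be called for, as happened with five tests of the November battery.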

Criteria for the cessation of extraction are 
numerous. Experiences of AAF psychologists of 
the research units revealed that many criteria 
used prior to the War were too conservative. 
Extraction of additional factors instead of add-
ing mere error variance actually contributed to
the realization of improved rotations and hence 
more meaningful factors. One criterion for 
cessation widely used by the members of the 
psychological research units was that if the 
product of the two absolutely highest factor 
loadings on the nth extraction was less than 
the standard error of zero correlation for the 
size of sample used (assuming null hypothesis) 
then the n-1 extraction is the last one to be
included. Symbolically, if $|k_{a_1} \times k_{a_2}| < \sigma_{r_0}$, then
the extraction should cease at the n-1 centroid 
factor. 

For the correlation matrix of the November 
battery, the product of the highest two loadings 
of the eighth centroid factor, .179 x .176, was 
equal to .0315, which was less than .0531, the
standard error of zero correlation for a sample
of 356 (354 degrees of freedom). Accordingly
seven factors were rotated. (After the first seven 
rotations the eighth factor was introduced in 
order that the other factors might be better 
defined. Elaboration of this point follows in a 
subsequent paragraph: technique of rotation.) 

For the correlation matrix of the September 
battery the product of the highest two loadings 
of the ninth centroid factor (+.146) x (—.138) 
was equal to .0201, which was less than .0351,
the standard error of zero correlation for a
sample of 815 (813 degrees of freedom). There-
fore eight factors were rotated. (After 44 rota-
tions, the ninth factor had to be introduced in
order to obtain a more meaningful solution—a 
point subsequently to be explained.) 
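
The cessation rule applied to both batteries can be checked numerically. The sketch below assumes that the standard error of a zero correlation is taken as 1/sqrt(df), a form that reproduces the value of .0351 quoted for 813 degrees of freedom. The function names are illustrative, and only the two highest loadings reported in the text are supplied.

```python
import math


def se_zero_correlation(df):
    """Standard error of r under the null hypothesis, assumed to be 1/sqrt(df)."""
    return 1.0 / math.sqrt(df)


def stop_before(loadings, df):
    """True if the product of the two largest absolute loadings on this
    factor is below the standard error, so extraction stops at factor n-1."""
    top_two = sorted((abs(a) for a in loadings), reverse=True)[:2]
    return top_two[0] * top_two[1] < se_zero_correlation(df)


# Two highest loadings reported for the eighth (November) and ninth
# (September) centroid factors; the remaining loadings are omitted.
november_eighth = stop_before([0.179, 0.176], df=354)
september_ninth = stop_before([0.146, -0.138], df=813)
```

Both calls return True under this assumption, which agrees with the decision to rotate seven factors for the November battery and eight for the September battery in the first instance.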

Another indication of the adequacy of a 
centroid extraction is the dispersion of the 
residuals of the last centroid extracted. For the 
matrix of intercorrelations of the November, 
1943, battery the 190 seventh factor residuals 
ranged from —.119 to +.073. The standard 
deviation of this distribution was .0283. 

For the matrix of intercorrelations of the 
September, 1944, battery the 253 eighth factor 
residuals ranged from —.on3 to +.061. The 
standard deviation of this distribution was 
.0173. For both matrices the standard deviations
of the distribution of residuals were considera-
bly less than the standard errors of zero correla-
tion for the respective numbers of degrees of
freedom. 
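
The counts of residuals follow from the size of each matrix: n variables yield n(n-1)/2 distinct off-diagonal residuals, so that 190 residuals correspond to 20 variables and 253 to 23. The sketch below reproduces these counts and compares the reported standard deviations with the standard error of a zero correlation, again assumed to be 1/sqrt(df). The dictionary layout is illustrative only.

```python
import math


def n_residuals(n_variables):
    """Number of distinct off-diagonal entries in a symmetric matrix."""
    return n_variables * (n_variables - 1) // 2


# Standard deviations of the residuals as read from the text
november = {"sd": 0.0283, "df": 354, "count": n_residuals(20)}
september = {"sd": 0.0173, "df": 813, "count": n_residuals(23)}

for battery in (november, september):
    battery["se_zero_r"] = 1.0 / math.sqrt(battery["df"])  # assumed form
    battery["sd_below_se"] = battery["sd"] < battery["se_zero_r"]
```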


3. Techniques of Rotation 


Graphical rotation by the Zimmerman method 
(20) was employed for both matrices of inter-
correlations.¹ The purpose of the rotation was


¹ This outstanding innovation enables an ex-
perienced technician to perform as many as
to realize both simple structure and positive
manifold. In other words, an attempt was made 
to obtain a maximum number of entries of a 
value near zero and to reduce the number and 
size of negative loadings. 
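
Each rotation turns one pair of factor axes through an angle and leaves the communality of every test unchanged. The following is a minimal sketch of one such orthogonal rotation, together with a crude count of near-zero and negative loadings before and after. The loadings, the angle, and the threshold of .10 for "near zero" are invented for illustration and are not taken from the Zimmerman procedure, which is graphical.

```python
import math


def rotate_pair(loadings, i, j, degrees):
    """Rotate factor axes i and j through the given angle (orthogonal rotation)."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    rotated = []
    for row in loadings:
        new_row = list(row)
        new_row[i] = row[i] * c + row[j] * s
        new_row[j] = -row[i] * s + row[j] * c
        rotated.append(new_row)
    return rotated


def structure_summary(loadings, near_zero=0.10):
    """Return (count of near-zero loadings, count of clearly negative loadings)."""
    flat = [a for row in loadings for a in row]
    zeros = sum(1 for a in flat if abs(a) < near_zero)
    negatives = sum(1 for a in flat if a <= -near_zero)
    return zeros, negatives


# Invented centroid loadings of four tests on two factors
centroid = [[0.60, 0.60], [0.50, 0.50], [0.50, -0.50], [0.40, -0.40]]
rotated = rotate_pair(centroid, 0, 1, -45)

before = structure_summary(centroid)
after = structure_summary(rotated)
```

In this invented case the rotation removes both negative loadings and brings four loadings to approximately zero, which is the sense in which a rotation moves toward positive manifold and simple structure.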

For the matrix of intercorrelations of the No- 
vember, 1943, battery (administered to Negroes) 
seven factors were used in the first seven rota- 
tions (in adherence to the criterion of cessation 
of extraction). However, since at the conclusion 
of the seventh rotation tests known from previ- 
ous analyses to be loaded with numerical, 
reasoning, and verbal variances, were clustered 
about an axis, the eighth centroid factor was 
introduced in order to separate, if possible, any 
possible components of what appeared to be a 
general intellectual factor. Variance in the so- 
called intellectual factor did separate: variance 
identified as verbal remaining on what had been 
the intellectual factor and the remainder of 
the variance, evidently number and reasoning, 
going over to the eighth orthogonal factor. Since 
the two tests of the battery most heavily loaded 
in this “doublet” factor had been revealed in 
previous analyses to be complex with respect 
to both reasoning and number, and since no 
test known to be relatively pure with respect to 
either factor was included in the battery, the 
outcome was not surprising. The final factor 
loadings at the conclusion of the forty-fifth 
rotation are presented along with test com- 
munalities in Table 4. 

For the matrix of intercorrelations of the 
September, 1944, battery (administered to West 
Point Cadets) the first eight centroid factors 
were subjected to forty-two rotations. The 
seventh factor appeared difficult to define, since
it contained a high loading (.480) in the pass- 
fail criterion and a substantial loading (.429) in 
only one test (Biographical Data—Pilot, CE602D). 
Without success, several trial rotations were 
undertaken to increase the loadings of this 
factor on other tests in order that identification 
might be facilitated (and at the same time to 
fulfill the requirements of both positive mani- 
fold and simple structure). Finally, the ninth 
factor, which according to the criterion em- 
ployed for cessation need not be retained, was 
introduced in several trial rotations with the 
seventh factor and others in the expectation 
that both factors seven and nine might be 
identified. Again several trial rotations failed 
to reveal a solution satisfactory to the require- 
ments of positive manifold and simple structure 


fifteen rotations of pairs of axes per hour on 
which twenty test variables have loadings. It is 
highly probable that the Zimmerman technique 
will soon replace the more cumbersome and 
inefficient procedures now commonly employed 
by many factor analysts.

and still open to a psychologically meaningful
interpretation. Since all loadings on the seventh 
factor other than those of the pilot pass-fail 
criterion and one test were negligible, the only 
remaining approach appeared to be that of 
attempting to build up factor nine and to reduce 
factor seven to the role of a residual (a factor 
containing small positive and negative loadings). 
After twenty-three additional rotations, factor 
nine was loaded highly (.436) in the criterion 
and substantially in two tests (.531 and .352), 
and factor seven became a residual with both 
positive and negative loadings within the narrow 
range of —.137 to .130. A psychologically mean- 
ingful interpretation of factor nine appeared 
feasible. The factor loadings and communalities 
for tests at the conclusion of the final rotation 
are given in Table 5. 

On the basis of several factor analyses the 
weighted loadings of the factors and com- 
munalities, as well as reliabilities and validities 
of the test, have been reported in the Composite 
Factor Analysis Summary (24). These loadings 
will be taken to represent those of a representa- 
tive white aviation-cadet population. 


4. Satisfactory Fulfillment of the 
Requirements of Rotation 


The criterion of positive manifold was 
reasonably well satisfied in the final rotations 
of both matrices. After the final rotation (the 
forty-fifth) of the eight centroid factors extracted 
for the November, 1943, battery, only eight 
loadings were equal to or algebraically less than 
—.100 (the highest negative loading being 
—.190). In all, thirty-two negative entries ap- 
peared for the eight rotated factors. 

For the final rotation of the matrix of the
West Point population the highest negatives in
excess of −.100 in factors (other than the
residual) were −.175, −.159, and −.127.
Twenty-seven other negative loadings in the
eight meaningful factors ranged between −.099
and .000, with only nine being between −.099
and −.050. In the residual factor eleven loadings
were positive, eleven negative, although
the highest negative was only −.137 (and the
highest positive only .130).

In the rotation procedure for both matrices
an attempt was made wherever possible to
maximize the number of vanishing loadings. In
view of the fact that most of the tests in both
batteries were complex in previous analyses, it
was not surprising that final rotations for both
matrices of intercorrelations revealed a not too
high degree of simple structure. If loadings
absolutely less than .150 are considered to be
insignificant, inspection of Table 4 shows that
for the November battery (administered to
Negroes) the number of insignificant weights
for each of the eight factors was respectively as
follows: 11, 11, 12, 7, 14, 11, 12, and 10. Similarly,
for the September battery the number of
insignificant loadings for the eight real factors
was respectively: 10, 15, 14, 12, 14, 7, 11, and
15. Despite the complexity of the tests, a fair
degree of simple structure was evident for both
matrices in the final rotation.

Another condition to be fulfilled which in 
mathematical terms is absolute was that the 
sum of common-factor variances (communality) 
of a test after rotation must equal its common- 
factor variance before rotation (centroid factor 
variance). In other words, the sum of common- 
factor variances of a test, or the magnitude of 
a test vector, remains invariant under rotation. 
In the Zimmerman method of rotation, negligible 
discrepancies accrue after numerous rotations 
to the extent that any inaccuracies in scales 
employed or any degree of bluntness of a pencil 
point are present. 
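
The invariance requirement can be illustrated numerically. The sketch below is not the Zimmerman graphical procedure itself; it merely applies an exact orthogonal rotation to one pair of axes and confirms that each test's communality (the sum of its squared loadings) is unchanged. The loadings, the angle, and the function names are hypothetical values chosen for the example.

```python
import math


def rotate_pair(loadings, i, j, degrees):
    """Orthogonally rotate factor axes i and j; every row is one test."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    rotated = []
    for row in loadings:
        new_row = list(row)
        new_row[i] = row[i] * c + row[j] * s
        new_row[j] = -row[i] * s + row[j] * c
        rotated.append(new_row)
    return rotated


def communalities(loadings):
    """Sum of squared loadings for each test (its common-factor variance)."""
    return [sum(a * a for a in row) for row in loadings]


# Hypothetical centroid loadings of three tests on three factors.
loadings = [
    [0.60, 0.30, 0.10],
    [0.45, -0.20, 0.35],
    [0.20, 0.50, -0.10],
]

rotated = rotate_pair(loadings, 0, 2, degrees=37.0)

for before, after in zip(communalities(loadings), communalities(rotated)):
    # The two values agree up to floating-point error.
    print(round(before, 6), round(after, 6))
```

In a graphical rotation the loadings are read from a plot rather than computed, so small departures from this exact equality are to be expected and tend to accumulate over many rotations.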

At the conclusion of the rotations of the 
factors of the November, 1943, battery, minor 
discrepancies between the communalities of tests 
before rotation and after rotation existed. With 
the communality of each test in the centroid 
factors taken as the standard of reference, the 
common-factor variance following the forty-fifth 
rotation deviated from the assumed standard 
between −.012 and +.002. In general, the
discrepancies were negative. 

In the rotations of the second matrix the 
discrepancies ranged between −.009 and +.006,
again with a tendency for a slight underesti- 
mation. However, these discrepancies were of no 
practical consequence. The requirement of in- 
variance was considered to be satisfied for both 
matrices. 


B. MULTIPLE REGRESSION EQUATIONS IN
THE PREDICTION OF EACH PILOT PASS-FAIL
CRITERION FROM TESTS OF THE
RESPECTIVE BATTERIES


In order that the validity of each test 
battery in terms of the multiple correla- 
tion between the pass-fail criterion and 
the test scores (optimally weighted) might 
be determined and in order that beta 
weights of these tests might be ascer- 
tained for inclusion in multiple regres- 
sion equations, the Doolittle method as 
outlined by Guilford (2, pp. 263-268) 
was employed.
Ezekiel’s correction (1) was employed
for the coefficient of multiple determination
(R²) and the coefficient of multiple
correlation (R), in order that an unbiased
estimate of the most probable
values of the parameters in the universe
corresponding to these statistics might
be obtained. (These data are presented
in Table 15.)
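
As a rough illustration of the correction just mentioned, the sketch below uses the form in which it is commonly stated, corrected R² = 1 − (1 − R²)(N − 1)/(N − m), where m is the number of constants fitted (predictors plus one). The text does not give the formula, so this form, the function names, and the numerical inputs are all assumptions made for the example.

```python
import math


def corrected_r_squared(r_squared: float, n_cases: int, n_predictors: int) -> float:
    """Shrunken coefficient of multiple determination (commonly cited form)."""
    m = n_predictors + 1  # constants fitted: one weight per predictor plus the intercept
    return 1.0 - (1.0 - r_squared) * (n_cases - 1) / (n_cases - m)


def corrected_r(r_squared: float, n_cases: int, n_predictors: int) -> float:
    """Shrunken multiple correlation; a negative corrected R-squared is treated as zero."""
    return math.sqrt(max(0.0, corrected_r_squared(r_squared, n_cases, n_predictors)))


# Hypothetical values: R-squared of .25 from 4 predictors and 101 cases.
print(corrected_r_squared(0.25, n_cases=101, n_predictors=4))
print(corrected_r(0.25, n_cases=101, n_predictors=4))
```

The correction always lowers the obtained value, and the shrinkage grows as the number of predictors approaches the number of cases.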


IN CHAPTER I, two major questions
were proposed in the statement of
the problem. In Chapter III, the survey 
of statistical procedures closely followed 
the sequence of these two major ques- 
tions. Emphasis was placed upon a ra- 
tionale for, and description of, the sta- 
tistical operations employed. In the cur- 
rent chapter, the three major divisions 
are concerned with an interpretation of 
the statistical results. These divisions 
also parallel the organization of the 
statement of the problem presented in 
Chapter I. 

In part A, for the two populations 
of West Point Cadets and Negroes, the 
factors will be tentatively identified and 
interpreted. Among the three groups of 
West Point Cadets, Negroes, and the 
white aviation-cadets¹ (in general, pilots),
comparisons will be made of the amounts 
of loadings of the identified factors in 
tests and in the pilot pass-fail criteria. 
Hypotheses will be set up to account for 
marked differences in the weights of the 
factors in tests common to either two 
of, or three of, the groups and for dif- 
ferences in the factorial composition of 
the three pilot criteria. 

In part B, an interpretation of the results
of the traditional multiple-regression
equation technique for the two populations
of West Point Cadets and
Negroes will be undertaken. The two
points of major emphasis will include
(a) a comparison of the validity of each
test battery in terms of the coefficient
of multiple correlation between the pilot
pass-fail criterion and the test scores of
the battery optimally weighted, with the
factorial validity estimated from the
total common-factor variance of the
respective pilot criteria, and (b) a comparison
of the proportion of variance-contribution
of each optimally weighted
test to the coefficient of multiple determination
between the pilot criterion
and the tests of each battery.

¹ As mentioned previously, the data for the
representative white aviation-cadet population
were taken from (24). In most instances the
loadings are for pilot cadets.


A. AN INTERPRETATION OF THE FACTORS 
AND A COMPARISON OF THEIR LOAD- 
INGS IN TESTS AND IN THE PILOT 
CRITERIA FOR THREE AVIATION- 
CADET POPULATIONS 


For convenience, seven of the nine fac- 
tors to be identified and interpreted 
are somewhat arbitrarily grouped as in- 
tellectual functions of verbality, number, 
and reasoning; as factors of perception 
and spatial relations; and as factors of 
mechanical experience and pilot interest. 
Psychomotor-coordination and kinesthe- 
sis are considered independently. For 
the three populations of West Point 
Cadets, of Negroes, and of representa- 
tive white aviation-cadets, mostly pilots, 
comparisons will be made of the factor 
loadings in tests administered to the 
West Point Cadets and Negroes and of 
factor loadings in the pilot pass-fail cri- 
terion. In order that each factor may be 
more readily defined and discussed, a 
weight of a final rotated factor equal to 
or exceeding +.175 in any test admin- 
istered to either the West Point popula- 
tion or the Negro group is listed in ac- 
companying tables, along with the load- 
ings in the test for the other two popula- 
tions (whether these loadings be greater 
than, equal to, or less than the arbitrarily
chosen weight of .175). In those instances
in which a test is unique to either the 
November, 1943, or September, 1944, 
battery, an additional factor loading, of 
course, can be reported only for the 
white cadet population. As an economy 
measure in the tables to follow the 
roman numerals I, II, and III corre- 
sponding respectively to the groups of 
West Point Cadets, Negroes, and (repre- 
sentative) white cadets are to be placed 
at the head of each of the last three 
columns in which the loadings with re- 
spect to a given factor are listed for the 
several tests. At the conclusion of each 
list of tests for which the factor weights 
are reported is an entry for the pilot
criterion.

Throughout the interpretation of the 
factors, hypotheses will be set up to ra- 
tionalize noticeable differences among 
the loadings in the same test or in the 
same set of tests in two of, or three of, 
the groups. As a conclusion to the dis- 
cussion of each factor, brief mention will 
be made of the importance of the factor 
for each of the three populations to the 
predictive value, or validity, of the test 
battery primarily in terms of the mag- 
nitude and the sign of the loading in 
the criterion. 


One of the fundamental weaknesses 
of the Thurstone system of factor analy- 
sis is that the difference between two 
(or more) factor loadings derived from 
either the same matrix or from different 
matrices cannot be tested for statistical 
significance. In other words, there is no 
means for computing the standard error 
of a factor loading. The reason for this 
is that in rotation an infinite number of 
solutions is possible.² The slightest rotation
results in a change in the magnitude
of the projections of the test vector
upon the axes entering into the rotation.
To the extent, however, that the
Thurstone requirement of positive manifold
and simple structure is fulfilled, the
range within which the factor loadings 
may vary is somewhat limited. If these 
two criteria are fulfilled, limited com- 
parisons are feasible. As the number of 
observations entering into the computa- 
tions of the original intercorrelations in- 
creases, the greater, of course, is the 
stability of the factor loadings derived. 
Even with a sample of several thousand, 
the actual significance of the difference 
between two loadings can never be 
known in terms of any stated degree of 
probability. Therefore, in any compari- 
sons to be made between the loadings 
of a given factor in the same test or in 
different tests for two or more of the 
groups, considerable caution and restraint
should be exercised.


² In fact, with n axes, the number of possible
solutions is ∞^(n−1). After the first centroid axis is
determined, each of the n − 1 axes may assume
an infinite number of positions.

In the first chapter, the importance
of a factor as an intervening variable,
or theoretical construct, was emphasized.
Although the naming, or labeling, of a
factor is convenient for purposes of reference,
the emphasis in an operational approach
is placed upon the existence of
a real variable of a reasonable degree of
stability which stands in functional relationship
to certain antecedent conditions
such as test items (independent variables)
and consequent conditions of human
behavior such as responses to the
items (dependent variables). The basis
of description of a factor rests upon
certain properties, or requirements, which
are more or less common to
certain tests (the antecedent conditions,
or S-variables, referred to in I, C, 1) and
not common to others. In reference to
these properties, certain communalities
of responses (the consequent conditions,
or R-variables) are present. Through
application of statistical procedures of
correlation and factor analysis, these
communalities of response appear in the
form of loadings on factors. The names
applied to these factors are subject to
change and to modification, but the
presence of the factor by the operational
definition is relatively stable. Hence, any
of the descriptions of factors to follow
might better be considered as tentative
hypotheses, subject to modification,
which serve to communicate to the reader
the presence of useful and perhaps endurable
categories.

In short, whereas the definition of a 
factor is more or less arbitrary (often 
depending upon what psychological 
terms are in vogue), the existence of the 
factor is relatively certain, since its 
presence usually can be repeatedly dem- 
onstrated under controlled conditions of 
testing. Therefore, the following labels 
of factors stand for abbreviated hypothe- 
ses. Finally, introspective references are 
employed wherever their use makes the 
interpretation of a factor more compre- 
hensible. 


For convenience of reference to the
size of numerical values of factor loadings,
five descriptive terms, arbitrarily
chosen, are to be used. A loading of .500
or greater is said to be high; one between
.400 and .499, substantial; one
between .300 and .399, moderate; one
between .175 and .299, slight; and finally,
one less than .175, insignificant or negligible.
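
The five descriptive terms can be expressed as a simple lookup. The sketch below follows the cut-offs stated above; the function name is hypothetical, and applying the rule to the signed loading (so that any negative loading falls in the lowest category) is an assumption, since the text defines the terms only for positive weights.

```python
def describe_loading(loading: float) -> str:
    """Return the descriptive term for a factor loading, using the stated cut-offs."""
    if loading >= 0.500:
        return "high"
    if loading >= 0.400:
        return "substantial"
    if loading >= 0.300:
        return "moderate"
    if loading >= 0.175:
        return "slight"
    return "insignificant"


# Verbal-factor loadings quoted in the discussion of Table 6.
for value in (0.625, 0.486, 0.387, 0.233, 0.130):
    print(value, describe_loading(value))
```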
1. The Intellectual Factors of Verbality,
Reasoning, and Number 

a. Verbal factor. One of Thurstone’s 
primary mental abilities, the verbal fac- 
tor, is identified with those tests in which 
the comprehension of meanings of words, 
singly or collectively, and of the ideas 
associated with them is present (17). Tests 
of reading comprehension, of vocabulary, 
and of intelligence (scholastic aptitude) 
in the past have revealed high loadings 
in this factor. Among the tests of the 
two classification batteries, fourteen are 
loaded .181, or more, for one or more 
of the three groups. The loadings of 
these fourteen tests in the verbal factor 
are presented for the three groups in 
Table 6.

The test most indicative of a verbal 
factor is that of Reading Comprehension, 
with comparable factor weights of .625, 
.526, and .600 for the three groups of 
West Point Cadets, of Negroes, and of 
white cadets. The substantial saturations 
of the two mathematics tests in the 
verbal factor are not surprising, in view 
of the fact that a verbal statement of 
most problems is presented. Despite the 
fact that the verbal factor loadings of 
the three groups in Mathematics B ap- 
pear to be slightly lower than those of 
Mathematics A, the importance of ver- 
bal comprehension in Mathematics B is 
particularly apparent in the problems 
designed to measure reasoning—word 
problems similar to those encountered 
in high school algebra. Similarly, in the 
three tests, General Information, CE505E,
General Information, CE505F, and Mechanical
Information, all the multiple-choice
items consist of verbal statements.

TABLE 6

VERBAL-FACTOR LOADINGS IN DEFINITIVE TESTS, AND IN RESPECTIVE PILOT CRITERIA IN
THE THREE POPULATIONS OF WEST POINT, NEGRO, AND WHITE AVIATION CADETS

Test and Code Number                              Group Iᵃ IIᵇ IIIᶜ

Reading Comprehension CI614H                      625  526  60
Mathematics B (Arithmetic Reasoning) CI206C       404  420  27
(General) Mathematics A CI702F                    443  37
Dial and Table Reading CP622-21A                  237  435  10
General Information CE505E                        130  43
General Information CE505F                        244  37
Instrument Comprehension CI616C                   288  17
Instrument Comprehension I CI615B                 387  22
Instrument Comprehension II CI616B                290  24
Practical Judgment CI301C                         486  46
Spatial Orientation I CP501B                      181  046  08
Spatial Orientation II CP503B                     305  −030  14
Mechanical Information CI905B                     226  26
Mechanical Principles CI903B                      233  03
Pilot Criterion                                   050  098  −05

ᵃ West Point cadets.
ᵇ Negro aviation cadets.
ᶜ Representative white aviation-cadets.

In such a complex test as Dial and
Table Reading, the loadings for the
three groups in the verbal factor are
readily explained by the lengthy directions
and illustrative problems which
are expressed in verbal terms. Moreover, 
words are overprinted upon the dials 
and tables. Unless an examinee is able 
to associate the detailed directions at 
the beginning of every part of the test 
with subsequent items, and to relate 
verbal symbols in the items themselves 
to the numbers presented upon the dials 
and tables, he obviously would be unable 
to respond in an adequate manner to 
the simplest test item. In the tests upon 
spatial orientation and instrument com- 
prehension, the detailed directions again 
appear to account for the slight verbal 
factor weights, inasmuch as a successful 
performance upon these tests (novel to 
most of the examinees) requires a 
thorough understanding of the direc- 
tions. The loading of .387 in Instrument 
Comprehension I, somewhat higher than 
those loadings found in the other two 
forms of this test, may be attributed 
in part to the verbal descriptions of plane 
positions in the multiple-choice responses.
In the other two forms of the
test, pictures of planes in different posi- 
tions make up the multiple-choice re- 
sponses. 

In the test Practical Judgment, the
substantial loading of .486 is not surprising
inasmuch as the situational problems
are entirely verbal in statement,
as are the multiple-choice solutions. 
Viewed superficially, this test is as highly 
verbal as any other test in the battery, 
in that the description of each problem 
requires three or four sentences of rela- 
tively complex construction and of 
varied vocabulary. 

In the test Mechanical Principles, the 
slight weight of .233 may be accounted 
for, as in some of the other tests, perhaps 
by the tendency of the examinee to “talk 
sub-vocally” or to verbalize as he at- 
tempts to solve the item. Such an in- 
trospective conjecture does seem some- 
what reasonable in that as the examinee 
studies each diagram with its arrows 
and vectors he may project himself into 
the situation and set out to solve the
problem through the aid of verbal symbols.
Of course, the verbal statement of
each item, as well as the directions, may 
account for the slight loading. It may 
be that the West Point group with more 
academic training tended to verbalize 
their actions while matching photographs 
with larger aerial photographs or with 
maps. The directions for these two tests 
are relatively short. 

A somewhat surprising result is the
tendency for the Negro group (with the
exception of the two tests upon spatial
orientation) to place somewhat higher
than the West Point and white cadet
groups in their loadings upon the verbal
factor in tests other than Reading Comprehension.
Just the opposite outcome
might be expected for the West Point
Cadets, who in most instances have had
more experience with verbal material
through their academic training. An hypothesis
is suggested to rationalize this
apparent difference; namely, that the
weights in the verbal factor are a function
of the level of difficulty of the verbal
material with respect to each of the populations.
Specifically, it appears that the
level of difficulty is pitched perhaps too
high for Negroes in the Reading Comprehension
test (as evidenced by a loading
of .526 compared with those of .625
and .600 for groups I and III respectively).
The level of difficulty appears to
be more favorable to the Negroes in two
places: first, in the lengthy directions of
such a test as Dial and Table Reading
(as evidenced by a loading of .435 compared
with .237 and .100 for groups I
and III respectively) and second, in the
verbal material of the items in tests intended
to be non-verbal, such as Instrument
Comprehension I, Instrument
Comprehension II, Mathematics A, and
Mathematics B. In general, the directions
for most tests are designed to be
considerably below an examinee’s level
of reading comprehension.

Evidence contrary to the hypothesis is 
suggested by the test General Informa- 
tion, CE505E, with a loading of .130 for 
Negroes, compared with that of .430 for 
the white cadet population. No reason 
can be given for this difference unless it 
be (as seems unlikely) that the loading 
of .528 in the factor of mechanical ex- 
perience has absorbed a disproportionate 
part of the relatively low communality 
of .417. 

Another interesting result, though 
likely an insignificant one, is the slight 
positive contribution of the verbal factor 
to the validities of the two classification 
batteries administered to West Point 
Cadets and Negroes, as evidenced by the 
respective factor weights of .050 and
.098 for Negroes. These two positive
loadings in the pilot criterion stand in
contrast to that of −.050 for the white
cadet population. A workable, or plaus- 
ible, hypothesis is lacking to account for 
these differences which may likely be 
only chance fluctuations easily attributed 
to sampling errors. (It is possible, how- 
ever, that verbal instructions may play 
a more dominant role in one pilot train- 
ing program than in another.) 


b. Number factor. One of the easiest 
factors to identify, the number factor, 
also one of Thurstone’s primary abilities, 
occurs in those tasks in which the simple 
fundamental numerical operations of 
arithmetic: addition, subtraction, multi- 
plication, and division are involved. The 
use of numbers, as in the test of Dial 
and Table Reading, seems to give rise 
to the factor. In fact, the mere presence 
of numbers in a test seems to be frequently
a sufficient condition for the
appearance of a small loading in the factor,
as indicated by Spatial Orientation I,
in which a numbered aerial photograph
is matched with one of the lettered 
portions of a larger aerial photograph 
(although the variance in this factor may 
be attributed to the counting of ob- 
jects common to the two photographs in 
the matching of them). 

In the list of tests weighted for the
number factor (see Table 7), attention
should be called to the fact that the factor
loadings reported for the Negroes
(Column II) are probably inflated with
reasoning variance, since only the verbal
variance was separated from the “general
intellectual” factor.³ Hence each comparison
of loadings to be made is limited
in its scope. The most heavily weighted
tests in the number factor appear in
Table 7.

TABLE 7

NUMBER-FACTOR LOADINGS IN DEFINITIVE TESTS AND IN RESPECTIVE PILOT CRITERIA IN
THE THREE POPULATIONS OF WEST POINT, NEGRO, AND WHITE AVIATION CADETS*

Test and Code Number                              Group I II III

Numerical Operations—Front CI702B
Numerical Operations—Back CI702B                  708  81
(General) Mathematics A CI702F                    551  51
Mathematics B (Arithmetic Reasoning) CI206C       373  628  48
Dial and Table Reading CP622-21A                  587  397  53
Spatial Orientation I CP501B                      189  166  18
Reading Comprehension CI614H                      098  285  12
Discrimination Reaction Time CP611D               160  278  18
Complex Coordination CM701A                       105  207  05
Rudder Control CM120B                             −120  240  −03
Pilot Criterion                                   −053  −060  00

* The loadings for Negro cadets are for a general intellectual factor consisting of number and reasoning variance.

³ A possible indication that the factor has
absorbed most of the number and reasoning
variance is that the square root of the sum of
the number variance and the reasoning variance
of tests for the representative white aviation-cadet
group approximates the factor loadings of
this bifurcated factor in the tests more heavily
loaded in it.

Most illuminating in the identification
of the number factor are the two tests
in numerical operations, not included in
the November, 1943, battery. That the
presence of these pure tests was needed
for the identification of a number factor
in the November battery is evidenced
in part by the failure of the numerical
and reasoning components to separate
in the Negro group. All tests included
in the November battery evidently capable
of identifying numerical or reasoning
factors were, in terms of previous
analyses for the white cadet population,
extremely complex with respect to numerical,
reasoning, and verbal components.
Hence, as pointed out in the previous
chapter (III, A, 3), the difficulty of
separating the verbal factor from the
other two intellectual factors was evident
in the factor-analysis procedure.

In those tests not primarily concerned
with mathematics, the use or the presence
of numbers in the items results in
a loading in the number factor. As noted in
a preceding paragraph, the Dial and
Table Reading Test, with a high weight
in the number factor for all three groups,
demands the use of numbers in one way
or another (see the description of the test
in II, A, 1). Even in the Reading Comprehension
Test, highly saturated in the
verbal factor, slight weights appear for 
the number factor, apparently because of 
the numerical quantities employed in the 
two reading selections about the Merc- 
ator projection and about the air speed 
meter, and in the items upon these two 
selections. Small loadings in the two 
apparatus tests, Discrimination Reaction 
Time, and Complex Coordination, may 
be attributed perhaps to the counting 
of the number of lights (and to the count- 
ing during the process of testing of the 
number of keys, four in all, in the latter 
test). The appearance of a weight of 
.240 for the test of Rudder Control, ad- 
ministered to Negroes, cannot be ex- 
plained in terms of either a numerical 
or a reasoning component; although by 
a negative argument the reasoning com- 
ponent, perhaps, would be more likely 
to occur than the numerical (since no 
numbers are present in the testing situa- 
tion). 

For the West Point Cadets and white 
cadets, the loadings in the number fac- 
tor stand in exceedingly close agreement 
for all tests. When allowance is made 
for the presence of a reasoning com- 
ponent as well as a numerical component 
in the loadings listed for Negroes, the 
overall agreement is close. 
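
The allowance just described can be checked with a little arithmetic. If the bifurcated factor for group II absorbed both the number and the reasoning variance, its loading in a test should approximate the square root of the sum of the squared number and reasoning loadings found for the white cadet group. The sketch below assumes the loadings as read from Tables 7 and 8 (column III for the two separate factors, column II for the combined factor); the dictionary and function names are hypothetical.

```python
import math

# test: (number loading, group III; reasoning loading, group III; combined loading, group II)
tests = {
    "Mathematics B": (0.48, 0.47, 0.628),
    "Mathematics A": (0.51, 0.24, 0.551),
    "Dial and Table Reading": (0.53, 0.16, 0.397),
    "Discrimination Reaction Time": (0.18, 0.11, 0.278),
}


def combined_loading(number: float, reasoning: float) -> float:
    """Loading expected on a single factor carrying both orthogonal variances."""
    return math.hypot(number, reasoning)


for name, (number, reasoning, observed) in tests.items():
    expected = combined_loading(number, reasoning)
    print(f"{name}: expected {expected:.3f}, observed {observed:.3f}")
```

The correspondence is closest for the two mathematics tests, which are the tests most heavily loaded in the combined factor, and looser for the remaining tests.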

With respect to the pilot criterion the 
contribution of the number factor is 
zero, or slightly less than zero, for all 
three groups. A part of the negative load- 
ing for the Negro group is probably 
made up of reasoning variance which, 
as will be seen subsequently, is loaded 
about zero in the pilot criterion for the 
other two populations. 


c. Reasoning factor. Previous analyses
by psychological research units of the 
AAF revealed three reasoning factors 
with the degree of correlation among
them assumed to be zero, although possibilities
of absorption are numerous.
Reasoning I has been called “general 
reasoning.” The other two forms of rea- 
soning are even more obscure. Highest 
loading for reasoning II appeared in tests 
in which comprehension of analogies 
is apparently required. An interpretation 
of reasoning III will not be undertaken. 
That the factorial composition of reason- 
ing has not been too clearly defined by 
factor analysts is further indicated by 
the lack of agreement of the AAF results 
with Thurstone’s distinction between de- 
ductive reasoning and inductive reason- 
ing. 

The reasoning factor appearing for 
the West Point Cadets seems to resemble 
that of general reasoning in the white 
cadet population. The factor weights in 
the reasoning factor for the two groups 
stand in relatively close agreement. Of 
course, direct comparisons of the load- 
ings in the reasoning factor of these 
two groups with those of the bifurcated 
(reasoning-number) factor of the Negroes 
are impossible. If some of the number 
variance can be discounted, the agree- 
ment is fairly close with the possible 
exception of the test of Mechanical Principles,
CI903B. The factor weights of
the tests most heavily loaded in the rea- 
soning factor for the three groups are 
presented in Table 8. 

The appearance of the factor in such 
tests as Mathematics B suggests that the
factor may be interpreted as an ex- 
aminee’s ability to relate the essential 
properties, characteristics, or require- 
ments of a problem into that unique 
combination or pattern of steps necessary 
for its solution. In common sense terms, 
it resembles that form of scholastic apti- 
tude required for successful progress in 
the solution of word problems encountered
in intermediate algebra. It may
even be another term for insight.

TABLE 8

REASONING-FACTOR LOADINGS IN DEFINITIVE TESTS AND IN RESPECTIVE PILOT CRITERIA IN
THE THREE POPULATIONS OF WEST POINT, NEGRO, AND WHITE AVIATION-CADETS*

Test and Code Number                              Group I II III

Mathematics B (Arithmetic Reasoning) CI206C       513  628  47
Mechanical Principles CI903A                      081  34
Mechanical Principles CI903B                      404  34
Reading Comprehension CI614H                      305  285  19
Mechanical Information CI905B                     219  01
Spatial Orientation II CP503B                     181  081  17
Numerical Operations—Back CI702B                  217  11
Mathematics A CI702F                              551  24
Dial and Table Reading CP622-21A                  076  397  16
Discrimination Reaction Time CP611D               200  278  11
Pilot Criterion                                   075  00

* The loadings for Negro cadets are for a general intellectual factor consisting of number and reasoning variance.

Comparison of the factor weights reveals
several slight differences for the
groups with respect to the tests: Mechanical
Principles (both forms), Reading
Comprehension, and Mechanical Information.
The somewhat higher loadings
of these three tests for the West
Point group stand in essential agreement
with the higher weight for the
reasoning factor in the pilot criterion.
The inference to be made is that the
West Point Cadets, a highly selected
group with respect to academic achievement,
tend to use reasoning both in
answering test items and in pilot training
to a greater extent than do individuals
of the other two groups. Such a
tendency appears reasonable. Inasmuch
as West Point Cadets represent a group
of students of exceptionally high scholastic
achievement, and inasmuch as the
factor of reasoning is probably only second
in importance to the verbal factor
in successful college-preparatory and college
work, its appearance in more substantial
amounts than in the other
groups is indicative of a circumstance
of selectivity. In the test of Mechanical
Principles, and to a considerably lesser
degree in the test of Mechanical Information,
the West Point Cadets who, as a
group, throughout childhood and adolescence
may have had less interest in
mechanical gadgets and a greater interest
in symbolic materials, may be able to
compensate for their lack of mechanical
experience by reasoning out the problems
presented them. For example, an
examinee whose interest and experience
with mechanical devices may be ex- 
tremely slight, can solve from his know!- 
edge of the principles of high school 
physics (or mechanics) many items in the 
test of Mechanical Principles. In pilot 
training, an analogous set of circum- 
stances may be present. 

The greater amount of loading in the
reasoning factor in the test of Discrimina-
tion Reaction Time may again merely
represent the facility of West Point Ca-
dets to employ reasoning in unfamiliar
tasks. The largest proportion of the rea-
soning variance, which is only .040, may
be apparent at the earlier stages of learn-
ing for the West Point Group and not
in the later series of trials. In other
words, the contribution of reasoning to 
the variance may be so substantial in 
early trials that a loading as high as .200 
must appear. Unfortunately, no compari- 
sons may be made with the Negro group, 
since the number variances cannot be 
partialled out of the loading .278. 
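
The relation between the loading of .200 and the reasoning variance of .040 quoted above can be made explicit. The following is a minimal sketch, assuming orthogonal factors, so that the square of a loading gives the proportion of a test's variance attributable to that factor. The function name `variance_share` is introduced here for illustration only and is not part of the original analysis.

```python
def variance_share(loading: float) -> float:
    """Proportion of a test's total variance attributed to one
    orthogonal factor: the square of the factor loading."""
    return loading ** 2


# Reasoning-factor loading of Discrimination Reaction Time for the
# West Point Cadets, as quoted in the text.
west_point_loading = 0.200
west_point_share = variance_share(west_point_loading)

# The Negro loading (.278) belongs to the combined number-reasoning
# factor, so its square is NOT a pure reasoning variance.
negro_combined_share = variance_share(0.278)

print(f"West Point reasoning variance: {west_point_share:.3f}")
print(f"Negro number-reasoning variance: {negro_combined_share:.3f}")
```

The second figure is shown only to mark the difficulty noted in the text: it mixes number and reasoning variance and cannot be compared directly with the first.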

As mentioned in a preceding para- 
graph, the validity of the reasoning fac- 
tor appears to exist in a small degree 
for the West Point Cadets, whereas it 
does not for the other groups. In view 
of the presence of sampling errors, the 
loadings in the reasoning factor in the 
criterion, all absolutely less than .075, 
should be interpreted with caution. It 
may be concluded, however, that reason- 
ing variance along with that of the other 
intellectual factors does not contribute 
materially to the validity of the test 
battery for pilot populations. 


2. The Factors of Perceptual Speed and 
Spatial Relations 


a. Perceptual speed. Another one of 
Thurstone’s primary mental abilities, 
the factor of perceptual speed, has been 
investigated further by Thurstone (6). In
his exhaustive studies of perception not 
only perceptual speed reappeared, but 
also several other non-orthogonal fac- 
tors (sub-factors of the super-factor of 
perception) among which were “the abil- 
ity to form a perceptual closure against 
some distraction,” (6, p. 101), a common 
factor for optical illusions, a factor iden- 
tified to represent reaction time, a factor 
involving alternation effects (such as 
their rates in ambiguous figures), a fac- 
tor “concerned with the manipulation of 
two configurations simultaneously or in 
succession . . .,” (6, pp. 110-111), the re- 
appearance of the factor of perceptual 
speed, and several other less clearly de-
fined factors. 

The factor of perception revealed in 
the two factor analyses of the November, 
1943, battery and the September, 1944, 
battery has been identified as perceptual 
speed, although one or more of the non- 
orthogonal factors uncovered by Thur-
stone may be present. Guilford (9) and
Guilford and Zimmerman (11) have 
pointed out that the factor of perceptual 
speed is in evidence when rapid com- 
parisons of small, detailed, or complex 
visual figures and visual forms are made 
and when accurate discriminations of the 
similarities and differences in them are 
required. The factor of perceptual speed 
is likely to be in evidence “when in a 
multiple-choice response the correct fig- 
ure is hard to discriminate from among 
its distractors, the rule or principle hav-
ing been easy to apprehend. . . .” (9,
p. 399).

The descriptions of the three tests,
Spatial Orientation I, Spatial Orienta-
tion II (see II, C, 1, a), and especially
Speed of Identification (see II, C, 2),
would lead one to believe that the per-
ceptual speed factor should be heavily
loaded in these tests. The expectation
that the three tests should be highly
saturated in the perceptual speed factor
is supported by the high factor loadings
in the three tests. The weights of these
three tests in the factor identified as per-
ceptual speed, along with those in eleven
other tests, are listed in Table 9.

The high loadings are comparable, test 
by test, for the three groups. In the test 
Dial and Table Reading, the comparable 
loadings appear to be due to the com- 
mon need for all examinees to pick out 
quickly the relevant number among a 
conglomeration of others in one table, 
and to select in other tables from sev-
eral detailed, complex, and similar-
appearing numerical readings the correct
one.

For the three tests of instrument com-
prehension a certain discrimination in
the selection of correct figures (or of the
correct verbal description in Instrument
Comprehension I, CI615B) correspond-
ing to instrumental readings appears to
be necessary. This perceptual operation,
however, does not seem to be the major
function required for successful per-
formance upon this test. In general, the
higher loadings in these tests are for the
spatial-relations factor. An exception to
this general result occurs in the instance
of a weight of .374 in the perceptual
factor (and a loading of .369 in spatial-
relations factor) for Instrument Compre-
hension II, CI616B, administered to
Negroes. A possible hypothesis for this
occurrence is that the factor to be iden-
tified as kinesthesis with a loading of
.238 minimizes in part the role played
by the spatial-relations factor, and that
the function of perception (in combina-
tion with kinesthesis) is maximized. As
will be pointed out subsequently, the
presence of a loading in kinesthesis in
pencil-and-paper tests may represent a
projection of the kinesthetic factor,
perhaps a factor of empathy.


TABLE 9


PERCEPTUAL-FACTOR LOADINGS IN DEFINITIVE TESTS AND IN RESPECTIVE PILOT CRITERIA
IN THE THREE POPULATIONS OF WEST POINT, NEGRO, AND WHITE AVIATION-CADETS


Test and Code Number
Group: I II III


Speed of Identification CP610A 627 64
Spatial Orientation I CP501B 627 519 62
Spatial Orientation II CP503B 549 551 54
Dial and Table Reading CP622-21A 262 208 31
Instrument Comprehension CI616C 230 20
Instrument Comprehension I CI615B 144 18
Instrument Comprehension II CI616B 374 17
Discrimination Reaction Time CP611D 231 260 22
Finger Dexterity CP116A 188 130 20
Complex Coordination CM701A 227 -054 20
Biographical Data—Navigator CE602D 086 234 10
Biographical Data—Pilot CE602D 173 023 14
Pilot Criterion -040 260 15


In the test of Discrimination Reaction
Time, slight loadings, of approximately
the same size for the three groups, are
present in the perceptual speed factor.
These factor-weights may actually be re-
lated to the factor of perception, reac-
tion time, which Thurstone identified.
Inasmuch as it is necessary for the ex-
aminee to make a rapid sensory dis-
crimination among lights spaced alike
from one test trial to the next in their
arrangement on the panel, but hetero-
geneous with respect to color (white, red
and green), and inasmuch as it is neces-
sary for the examinee to select one among
four switches identical in appearance, the
presence of a perceptual speed factor
should be expected. However, other fac-
tors are probably apparent such as spa-
tial-relations and psychomotor coordina-
tion. It would seem that Thurstone’s
reaction time designation is somewhat
too inclusive a term.

Although the factors of spatial-rela- 
tions and psychomotor coordination are
to be considered individually, mention 
of them at this point is appropriate 
with respect to the test of Discrimination 
Reaction Time. Moreover, it will be 
helpful at this point to distinguish, if 
possible, between the factors of percep- 
tual speed, spatial relations, and visuali- 
zation, inasmuch as the test of Discrimi- 
nation Reaction Time illustrates rather 
well the differences between the psycho- 
logical interpretations of these factors. 
Classification of these terms at this point
will tend to minimize, to an extent, pos-
sible confusion in subsequent sections.

The decision as to whether to throw 
the left-hand or right-hand switch (in his 
motor choice) is thought to describe the 
examinee’s ability to perceive the spatial 
order among the switches. Hence, the ap- 
praisal of relationships between the 
switches in their spatial arrangement in 
this test may be taken as an interpreta- 
tive definition of the construct spatial 
relations. Prior to the choice decision 
(as to which switch to throw), a discrim- 
ination is made as to the position of the 
red light with respect to the green light 
(left or right, above or below). The per- 
ception of the spatial relationships or 
order among the lights furnishes the 
mental set required for the decision. 
Hence, another feature of the spatial 
relationships is that on the sensory side. 
Of course, a complete distinction be-
tween the sensory and motor choice is
impossible. For the three groups of West
Point cadets, Negroes, and white cadets,
loadings in the spatial relations factor
for this test are .256, .400, and .420, re-
spectively.

Closely related to space is visualiza-
tion. Thurstone has not distinguished
these two factors as independent.
Factorial analyses by psychologists of
the AAF did appear to separate space
from visualization and to discredit the
term spatial-visualization. The sensory
discrimination in the test of Discrimina-
tion Reaction Time might possibly be
considered visualization, just as the mo-
tor, or choice, reaction might be inter-
preted as psychomotor coordination. To
a degree, such a distinction may be valid.
However, in the definition rendered by 
Guilford and Zimmerman, a new posi- 
tion of the stimulus objects must result 
following (its mental) manipulation, ro- 
tation, rearrangement, or inversion: 
The tests most heavily saturated with it all
seem to involve a visual manipulative ability.
In solving the problems it is necessary mentally
to move, turn, twist, or rotate an object or
objects and to recognize a new appearance or
position after the prescribed manipulation has
been performed. (11, p. 157)


In the instance of the test of Dis- 
crimination Reaction Time, the role of 
visualization may exist, but the stimuli 
do not appear to be in the need of 
manipulation suggested by the defini- 
tion. Examples of visualization tests 
would include those of counting blocks, 
and of mechanical principles (illustrated 
by a test of the Bennett type). 

Zimmerman* has suggested that a test
may be either spatial or visual for an
examinee, depending upon the activities
involved. If, in a test of flags the subject
is able, so to speak, to pick up the flag,
move it, turn it about as if he actually
had a model in his hands, then visualiza-
tion is dominant. On the other hand, if
the examinee has to move himself to
a different position, as in cocking his
head to one side or “standing upon his
head,” then a spatial factor is involved.

* In a personal communication to the writer,
Zimmerman has proposed this hypothesis.

For neither of the two test batteries
did the visualization factor emerge in
the analyses. It is possible that a con-
siderable portion of its variance was ab-
sorbed by the factor identified as space,
especially in the instance of the test,
Mechanical Principles, CI903B. For the
test of Discrimination Reaction Time,
the loading in the visualization factor
reported in (24) is .200.

The presence of the perceptual speed
factor in small amounts in the two tests
Complex Coordination and Finger Dex-
terity may be explained in terms of
discriminations made among the detailed 
stimuli in the respective patterns in the 
visual field. In the former test the dis- 
crimination between the two different 
colors of light may account in part for 
the small loading for the West Point 
Cadets and white cadets. (Distinctions in 
the positions of the lights, of course, 
would be more nearly related to the spa- 
tial factor.) The negative loading for 
Negroes cannot be satisfactorily ration- 
alized, especially in view of the fact that 
the perceptual factor is slightly more 
dominant for this group in the other 
tests than it is for either of the other 
two groups. In the latter test, during the 
rapid fitting of pegs into the holes, the 
perception of holes in the board in rela- 
tion to the background about them, may 
account for the variance in this factor. 
On the other hand, speed rather than 
perception may be what the factor weight 
represents. For the three groups, the 
loadings are comparable. 

Loadings in the test Biographical Data
are, with the exception of the Negro
group in the navigator form, insignif-
icant. In the test the responses to several
items are repetitious in their wording,
with only one word being different in
each of several responses consisting of
many words. Under the conditions of
testing, the selection of the appropriate
answer among so many similar-appear-
ing clusters or groups of items may ac-
count for the small loading in perceptual
speed. A more likely explanation may
be that the scoring weights employed
artificially induced the appearance of
this factor for Negro pilots.

Whereas the factor weights in per-
ceptual speed in the tests for the three
groups are roughly the same, consider-
able differences exist in the pilot cri-
terion. Next to the factor called kinesthe-
sis, that of perceptual speed is the most
valid one for Negroes, with its loading
of .260 in the pilot criterion. For the
West Point Cadets the factor weight
actually is negative (-.040). Between
these two extremes falls the loading of
.150 for the white cadet population. In-
asmuch as the loadings in the tests do 
not differ noticeably, the only place for 
speculation as to what could account for 
the difference is in the pilot-training ac- 
tivity itself. It is possible that the types 
of teaching methods employed at the 
various training units might account, to 
a degree, for the discrepancies. Similari- 
ties in the perceptual factor weights of 
the tests would argue against differences 
in environmental influences of a per-
ceptual nature, as sampled by the test
items. It is possible that previous ex- 
periences in perceptual activities not 
covered by the test items may be im- 
portant in pilot training. 

Some light may be shed upon the 
problem in terms of the loading in the 
pilot criterion in the spatial relations 
factor. In this instance the roles of
Negroes and West Point Cadets are re- 
versed from what they are in the per- 
ceptual factor. For West Point Cadets, 
Negroes, and white cadets the factor 
weights are .415, .190, and .320 respec-
tively. Although it is possible that vari-
ances may have been absorbed from one
factor to another, or that any variances 
associated with visualization may have 
been distributed by chance in a manner 
to accentuate the differences in vari- 
ances between the factors of perceptual 
speed and spatial relations, a reasonable 
conclusion is that the spatial relations 
factor actually is substantially valid for 
West Point Cadets and perhaps neg- 
ligibly valid for Negroes and that the 
perceptual-speed factor actually is slight- 
ly valid for Negroes and perhaps neg- 
ligibly invalid for West Point Cadets. 
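
The reversal described above can be laid out numerically. The sketch below is illustrative only: it assumes orthogonal factors, so that a criterion loading may be read as the validity of the factor and its square as the share of criterion variance it accounts for. The loadings are those quoted in the text (the white-cadet spatial loading is taken as .320, in agreement with Table 10); the dictionary layout and function names are hypothetical.

```python
# Pilot-criterion loadings quoted in the text for the two factors.
criterion_loadings = {
    "West Point": {"perceptual_speed": -0.040, "spatial_relations": 0.415},
    "Negro": {"perceptual_speed": 0.260, "spatial_relations": 0.190},
    "White": {"perceptual_speed": 0.150, "spatial_relations": 0.320},
}


def criterion_variance(loading: float) -> float:
    """Share of criterion variance accounted for by one orthogonal factor."""
    return loading ** 2


def more_valid_factor(loadings: dict) -> str:
    """Name of the factor with the larger (signed) criterion loading."""
    return max(loadings, key=lambda name: loadings[name])


for group, loadings in criterion_loadings.items():
    shares = {name: round(criterion_variance(value), 4)
              for name, value in loadings.items()}
    print(group, shares, "more valid:", more_valid_factor(loadings))
```

Under these assumptions the spatial relations factor accounts for roughly 17 per cent of the criterion variance for West Point Cadets, whereas the perceptual-speed factor accounts for roughly 7 per cent for Negroes; the remaining shares are all smaller.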


b. Spatial relations. In the previous 
section (I, A, 2, a) concerning the factor 
of perceptual speed, an attempt was 
made to distinguish between space and 
visualization. During the course of the 
war, AAF psychologists tentatively iden- 
tified three factors of space as well as the 
factor of visualization. Only the first 
space factor, spatial relations, has clearly 
appeared in most analyses. An hypothesis 
that kinesthetic imagery is present in 
some tests has been proposed for one of 
the space factors (11). At a subsequent 
point support for this hypothesis will be 
indicated. Guilford and Zimmerman 
have defined the factor and have in- 
dicated its importance as follows: 

Space I seems to be an ability to perceive the 
spatial order or the relationships among objects. 
In several psychomotor tests, the decision of 
the examinee as to which way to move—right or 
left, up or down, forward or backward—de- 
pended on a correct appraisal of the stimulus 
arrangement. This ability probably outweighed 
all others in the pilot criterion. It may be a 
prominent requirement in any machine-operat- 
ing job that requires decisions as to direction 
of movement dependent on signals. (11, p. 157) 

The consistent appearance of the spa- 
tial relations factor in both pencil-and- 
paper tests and psychomotor tests in 
previous analyses is an important dis-
covery with many practical implications. 

High loadings in the spatial relations
factor for the two groups of West Point
Cadets and Negroes are in essential
agreement with those of the representa-
tive white cadet population derived from
previous analyses. In the accompanying
Table 10, in which loadings in the spa-
tial-relations factor in sixteen tests are
given, a reasonable degree of similarity
of the weights is present for the three
populations.

Moderate to high loadings in the spa- 
tial relations factor for the test of Com- 
plex Coordination, are to be expected 
from the nature of the task (see descrip- 
tion of the test in II, C, 1, b). The pres- 
ence of a spatial relations factor is clearly 
indicated by the decisions required by 
the examinee as to whether to move the 
stick forward or backward, left or right, 
in conformity to the pattern of the two 
green lights in their (spatial) relation to 
the two red lights and as to how to move 
the rudder control into the required 
position indicated by the relation of a 
third green light to a third red light. 
Of course, the presence of some weight 
in a psychomotor coordination factor 
should be expected in that movements 
of the arms and hands effect the adjust- 
ment to the visual cues. In fact, a load- 
ing of .473 for West Point Cadets and 
a loading of .495 for Negroes did result 
in the psychomotor factor. 

TABLE 10


SPATIAL-RELATIONS FACTOR LOADINGS IN DEFINITIVE TESTS AND IN RESPECTIVE PILOT CRITERIA
IN THE THREE POPULATIONS OF WEST POINT, NEGRO, AND WHITE AVIATION-CADETS


Test and Code Number
Group: I II III


Complex Coordination CM701A 49
Two-Hand Coordination CM101A 425 41
Discrimination Reaction Time CP611D 400 42
Instrument Comprehension CI616C 41
Instrument Comprehension I CI615B 44
Instrument Comprehension II CI616B 369 53
Dial and Table Reading CP622-21A 487 42
Mechanical Principles CI903A 12
Mechanical Principles CI903B 12
Spatial Orientation I CP501B 280 10
Spatial Orientation II CP503B 219 16
General Information CE505E 23
General Information CE505F 10
Mechanical Information CI905B 02
Rudder Control CM120B 13
Rotary Pursuit CP410B 14
Finger Dexterity CP116A 12
Pilot Criterion 32


Another apparatus test which aids ma-
terially in the identification of the fac-
tor of space is Two-Hand Coordination.
Like the test of Complex Coordination,
it is loaded about as much in psycho-
motor coordination as in space for all
three groups. A review of the descrip-
tion of this Two-Hand Coordination test
(see II, C, 1, b) should indicate that the
decision to use either the right hand or
left hand in either right or left move-
ments is a choice reaction which should
be represented by the spatial relations
factor. That some reasoning variance
might be expected in the early stages of
learning is indicated by small loadings
of .123 for West Point Cadets and of
.156 for Negroes (which of course may
include number variance), although for
the white cadet population, a loading
of zero exists.

A third apparatus test, the factorial
complexity of which with respect to per-
ceptual, spatial, and psychomotor fac-
tors was revealed in several previous
analyses (24), is Discrimination Reaction
Time, loaded .400 and .420 in the spa-
tial factor for Negroes and white cadets
respectively. It is weighted only .256
for West Point Cadets. This loading dif-
fers sufficiently from that for the two
other populations that an explanation
is required.

What are some possible reasons to ac-
count for loading of only .256 for West
Point Cadets? First of all, in view of the
low communality of .336, a weight of
.351 for the West Point group in the
psychomotor coordination factor may
account in part for the lower loading in
the spatial relations factor (despite the
apparent validity of this factor indicated
by the presence of a substantial weight
of .435 in the pilot criterion). That the
residual factor with a loading .130 in
this test may represent a portion of the
variance which has split off from the spa-
tial factor is one possible explanation.
If another factor (not representing mere
error variance) could be extracted in
order to permit further rotation, the
residual factor might emerge as a second
space factor, or even as a visualization
factor. Of course, it may be true that the
factor of psychomotor coordination for
West Point Cadets actually assumes a
more important role than the space fac-
tor. Evidence for the dominance of the
psychomotor factor over the others in
apparatus tests administered to West
Point Cadets is available in the presence
of its loadings in five out of the six of
these tests higher than those for the
white cadet population.
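
The argument from the low communality of .336 can be checked by simple arithmetic. The sketch below is illustrative only. It assumes orthogonal factors, so that the communality of a test equals the sum of its squared loadings, and it uses only the West Point loadings for Discrimination Reaction Time that are quoted in this chapter (spatial .256, psychomotor .351, residual .130, reasoning .200, perceptual speed .231). Loadings in any factors not quoted here are left out, so the remainder is not an error term. All variable names are hypothetical.

```python
# West Point loadings for Discrimination Reaction Time quoted in the text
# and in Tables 8 and 9; factors not quoted in the chapter are omitted.
drt_west_point = {
    "spatial_relations": 0.256,
    "psychomotor_coordination": 0.351,
    "residual": 0.130,
    "reasoning": 0.200,
    "perceptual_speed": 0.231,
}
reported_communality = 0.336

# Under orthogonal factors each squared loading is a variance share.
shares = {name: value ** 2 for name, value in drt_west_point.items()}
accounted = sum(shares.values())
unlisted = reported_communality - accounted

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:26s} {share:.3f}")
print(f"{'accounted for':26s} {accounted:.3f}")
print(f"{'left to unquoted factors':26s} {unlisted:.3f}")
```

On these figures the psychomotor factor carries nearly twice the variance of the spatial relations factor in this test for the West Point group, which is the point of the first explanation offered above.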

Finally, practice effects in the train- 
ing received with the four preceding 
psychomotor tests of Rotary Pursuit, 
Two-Hand Coordination, Complex Co- 
ordination, and Rudder Control, may 
have rendered psychomotor coordination 
to be more important and spatial rela- 
tions less significant in performance of 
the tests. In other words, the relative 
contribution of the visual cues to the 
execution of a quick motor response is 
far less important than the coordination 
involved. Stated in still another manner, 
at their stage of learning, the West
Point Cadets, having obtained maximum
proficiency in use of visual cues, per-
formed successfully in the reaction-time
situation primarily to the extent that
they could effect a motor response
smoothly.

In the three other apparatus tests, sev-
eral differences are present in the load-
ings in the space factor. Among the
most marked dissimilarities in factor
weights are those in the Rudder Control
Test: loadings of .496 for West Point
Cadets, of .090 for Negroes, and of .130
for white cadets. Since most of the com-
mon-factor variance for the Negro group
is accounted for by the saturation of the
test in the factor identified as kinesthe-
sis, which may well be the name for a
second space factor, the loading of .090
in the spatial relations factor is not too
surprising. Moreover, the weight in the
psychomotor factor is moderate (.311).
Whether a second space factor might be
brought out for West Point Cadets
through the type of rotation suggested
in the next-to-the-last paragraph is prob-
lematical. It does seem reasonable to be-
lieve that the loading of .496 embraces
more than spatial-relations variance, es- 
pecially when the weight is compared 
with that of .130 for white cadets. That 
the field of visual cues in the rudder 
control task is less restricted than those 
fields of visual cues in tests of reaction 
time and coordination might account for 
the negligible weight of .130 in the white 
cadet population. 

Somewhat the same set of circum- 
stances as appeared in the Rudder Con- 
trol test is present on a smaller scale in 
the test Rotary Pursuit. For the two 
groups of Negroes and West Point Ca-
dets, the factor weights of .259 and .095
are proportional respectively to those 
obtained in the test of Rudder Control. 
Another parallel is the slight weight of 
.186 in the kinesthetic factor (for test 
of Rudder Control). The sweeping cir- 
cular movements of the arms may ac- 
count in part for this small loading. 
Primarily the test is one of eye-hand co- 
ordination—a fact well substantiated by 
the high loadings in the psychomotor- 
coordination factor for all three popula- 
tions. What variance in the spatial-rela- 
tions factor does appear may likely be 
attributed to the decision as to which 
key to depress in the distraction task. 

In the sixth psychomotor test, Finger
Dexterity, the pattern of loadings in the
spatial-relations factor (.069, .273, and
.120 for West Point Cadets, Negroes,
and white cadets respectively) resembles
that of .256, .400, and .420 for the test
Discrimination Reaction Time. Inasmuch
as the Finger Dexterity test is the last
to be administered, the same hypothesis
of practice effects suggested to explain
the low loading of .256 in the test of
Discrimination Reaction Time may
account for the weight of .069 for West
Point Cadets.
At the particular level of learning of 
the West Point group, improvement in 
performance may depend almost entirely 
upon the physiological limits set by 
neuro-muscular mechanism. Further evi- 
dence that primarily a psychomotor func- 
tion is involved is the relatively high 
degree of purity of the test of Finger 
Dexterity. 

Among the numerous pencil-and-paper 
tests loaded in the spatial-relations fac- 
tor, the most definitive are the three 
forms of the test of instrument compre-
hension. With the exception of the load-
ing of .369 for the Negro group in In-
strument Comprehension II, the factor
weights are comparable among all forms
for the three populations.

That the spatial-relations factor can 
be as well represented in pencil-and- 
paper tests as in apparatus tests is highly 
significant. If, for example, success in a
given vocational task demands primarily 
a spatial-relations factor, the need for 
costly and time-consuming apparatus 
tests is minimized. Through symbolic 
representation of the possible positions 
of planes corresponding to instrument 
readings, the examinee is forced to make 
a decision as to whether the plane has 
banked left or right and has moved up 
or down (has climbed or has descended). 
Although these tests might be expected 
to emphasize intellectual factors, their 
relatively high degree of purity with re- 
spect to the spatial relations factor indi- 
cates that they successfully measure with 
symbolic material one of the same factors 
as do the psychomotor tests (without the 
presence of large amounts of other vari- 
ance found in the apparatus tests). 

In the factorially complex test, Dial 
and Table Reading, the moderate and
substantial loadings obtained in the spa- 
tial relations factor for the three groups 
are somewhat difficult to rationalize. To 
a small degree, visualization variance 
may be present, although the format of 
the dials would not seem to require a 
considerable amount of the so-called 
mental-manipulation or rotation of the 
perceived objects. To some extent the 
interpolation process required in read- 
ing the dials may account for the load- 
ings, inasmuch as the position of the 
needle stands in spatial relation to the 
numbers on the dial to its left and to its 
right. In the decision as to the reading 
represented, an oscillation in the per- 
ception between the numbers at the left 
and at the right, or above and below, 
is necessary. Similarly the part of the 
test upon table reading requires both 
“right-left” and “top-bottom” points of 
orientation to guide in the selection of 
the relevant number (from the conglom- 
eration of others). 

That the factorial composition of the 
test may be a function of the intellectual 
factors entering into scholastic aptitude 
is indicated by the lower loading (.307) 
in spatial relations factor for the West 
Point Cadets. Higher weights (.587 and 
.237) in the numerical and verbal fac- 
tors than those of .530 and .100 for white 
cadets may reveal that the importance 
of the spatial relations factor in the test 
is reduced; first, because of the previous 
amount of experience of the West Point 
Cadets with tables and quantitative data 
encountered in high-school and college 
mathematics-and-science courses and, sec- 
ond, because of the tendency of aca- 
demically trained people to verbalize. 
(The verbal loading for Negroes of .451 
was rationalized in terms of the require- 
ment of a reading level of relatively low 
difficulty in the lengthy directions.)

For the test of Mechanical Principles,
forms CI903A and CI903B, dissimilari-
ties in the weights (.358, .061, and .200
for the three groups) furnish an opportunity
for several hypotheses. The factor weight
of .358 for West Point Cadets is further
evidence of the possible complexity of
the space factor extracted for this group.
Inasmuch as previous analyses have re-
vealed a loading as high as .54 in the
visualization factor for this test, and in-
asmuch as visualization might be ex-
pected to play an important role in any
science or engineering curriculum, the
inflation of the loading in the spatial
relations factor (through the presence of
visualization variance) is a reasonable
occurrence. For the Negro group, a neg-
ligible weight of .165 in kinesthesis
furnishes a possible clue to the presence
of another source of variance in the
spatial factor for the West Point group.
For the Negro group, the appearance
of the kinesthetic factor may account in
part for the small loading of .061 in the
spatial relations factor. What little vari-
ance does appear in the space factor for
the test of Mechanical Principles, CI903B,
is highly suggestive that the hypothesis 
proposed by Guilford and Zimmerman 
for interpretation of a spatial relations 
factor is a meaningful one, inasmuch as 
a decision of left-right, up-down, or for- 
ward-backward movement is lacking for 
most items. Considerable rotation, ma- 
nipulation, and inversion of the visual 
images seem to be demanded. 

In the two tests, Spatial Orientation I 
and Spatial Orientation II, comparable 
factor weights in spatial relations are 
present for the West Point Cadets and 
Negroes. Loadings in the vicinity of .200 
are somewhat lower than what the “arm- 
chair” estimates of a person looking at
this test might be. Primarily, the test is
perceptual in line with the previously 
mentioned hypotheses set up by Guilford 
(9, p. 390) for the appearance of a per- 
ceptual speed factor. Study of the test 
does indicate that a decision of the di- 
rections of movement is lacking, although 
the stimuli in the aerial photographs and 
maps do stand in spatial relation to one 
another. It is quite possible that if most 
of the intricate detail were eliminated 
and if only a few clearly indicated land- 
marks (symbols) stood in relationship to 
one another, the factorial composition 
would be considerably altered in the di- 
rection of larger loadings in the spatial 
relations factor. 

In the three tests of information, Gen-
eral Information, CE505E, General In-
formation, CE505F, and Mechanical In-
formation, CI905B, negligible and slight
loadings are present in the spatial rela-
tions factor. In these tests, the factor may
function vicariously in the verbal items.
In other words, some of the responses to
items may depend to a slight degree
upon the examinee’s reenacting, so to
speak, an experience suggested by the
item. An example of such a possibility
is furnished by the following item taken 
from the General Information Test, 
CE505F: 

In making a turn to the left, pressure is 
applied: 

A. first with the rudder and then with the stick, 
B. first with the stick and then with the rudder, 
C. with the stick and rudder at the same time, 
D. with the rudder alone, 

E. don’t know. 

The validity of the spatial relations 
factor is indicated by the loadings of 
.415, .190 and .320 in the pilot criterion 
for the three groups of West Point Ca- 
dets, Negroes, and white cadets. That 
components of visualization, and even 
kinesthesis, may be present in the load-
ing of .415 for West Point Cadets is
possible. The somewhat lower weight 
for the Negroes may be attributed to the 
loading of .376 in the kinesthesis fac- 
tor. Experience of AAF research psy- 
chologists tended to indicate that the 
spatial relations factor is one of the most 
valid factors, if not the most valid factor, 
for prediction of pilot success. The spa- 
tial-factor loadings in the pilot crite- 
rion for the West Point Cadets and 
Negroes are consistent with the results 
of previous analyses. 


3. The Factors of Mechanical Experience
and Pilot Interest


In tests of the informational and bio-
graphical type, the factors of mechanical
experience and pilot interest are well-
defined. For convenience, these two fac-
tors are considered together, inasmuch
as several tests are complex with respect
to them.


a. Mechanical experience factor. In
previous analyses the factor of mechani-
cal experience has appeared repeatedly
in tests consisting of mechanical con-
tent. The examinee’s familiarity with
automobile parts, shop tools, machinery,
and miscellaneous gadgets apparently
accounts for high loadings in tests made
up of such informational items. Al-
though most clearly defined for the three
populations in the six tests: Mechanical
Information, CI905B, Mechanical Prin-
ciples, CI903A, Mechanical Principles,
CI903B, General Information, CE505E,
General Information, CE505F, and
Biographical Data—Pilot, CE602D, the
factor of mechanical experience is pres-
ent to a small degree in other pencil-
and-paper tests, in one psychomotor test,
and to a doubtful degree in a second
psychomotor test, as indicated by the
loadings for each of the three popula-
tions in Table 11.

TABLE 11

MECHANICAL-EXPERIENCE FACTOR LOADINGS IN DEFINITIVE TESTS AND IN RESPECTIVE PILOT CRI-
TERIA IN THE THREE POPULATIONS OF WEST POINT, NEGRO, AND WHITE AVIATION-CADETS

Test and Code Number                       Group (I, II, III)

Mechanical Information         CI905B      699  64
Mechanical Principles          CI903A      543  58
Mechanical Principles          CI903B      550  58
General Information            CE505F      580  34
General Information            CE505E      528  53
Biographical Data—Pilot        CE602D      441  264  50
Instrument Comprehension II    CI616B      372  03
Instrument Comprehension       CI616C      105  14
Two-Hand Coordination          CM101A      328  160  40
Spatial Orientation II         CP503B      223  325  15
Practical Judgment             CI301C      298  12
Reading Comprehension          CI614H      261  313  04
Rudder Control                 CM120B      218  146  01
Pilot Criterion                            040  130  27


For all three groups, the factor weights
in the first six tests in Table 11 stand in
close agreement with two exceptions. The
high loading of .580 for the West Point
Cadets in the test General Information,
CE505F, is due to the existence of a spuri-
ous correlation of this form with another
test, Mechanical Information, CI905B,
with which it has several items in com-
mon. As for the small loading of .264 in
Biographical Data—Pilot, administered 
to Negroes, an hypothesis is difficult to 
formulate. One possibility to account for 
this fact is that the scoring key placed, 
for Negroes, a disproportionate amount 
of emphasis upon items concerned with 
scholastic interests, hobbies, and other 
activities somewhat more common to 
the culture of the West Point Cadets and 
white cadets than to the culture of the 
Negroes. Many items given a positive
weight are concerned with curricular 
and extracurricular activities of the 
school in which many a Negro may never 
have had an opportunity to participate. 

Small positive loadings in the me-
chanical experience factor for other pen-
cil-and-paper tests, with the exception of
Spatial Orientation II, may be accounted
for by the presence of items of mechani-
cal content. In the test of Reading Com-
prehension, for example, one selection
is concerned with principles of forces
in mechanics; whereas another is built
about a familiar gadget, the compass.

For the test of Spatial Orientation II,
some of the variance may represent that 
of another sort of background factor— 
possibly experience with maps, figures, 
drawings, blueprints, and so forth. It 
might be expected, however, that the 
loading (.325) for Negroes should be 
somewhat lower than those loadings 
(.223 and .150) for West Point Cadets 
and white cadets, in view of the prob- 
ably greater emphasis in map-reading 
and drawing for the latter two groups in 
their formal education. No reasonably 
satisfactory hypothesis for the differences 
is apparent. 

Loadings in the two psychomotor tests, 
Two-Hand Coordination, and Rudder 
Control, may be explained to an extent 
in terms of a probable positive transfer
effect of the examinee’s experience with
other mechanical devices with which 
the two apparatus tests have several ele- 
ments in common. Higher loadings for 
the West Point group in these two tests 
may possibly be attributed to their con- 
scious attempts at transfer. In their 
physical education activities, apparatus 
resembling these two psychomotor tests 
might have been available for practice 
(although no positive information con- 
cerning the types of training equipment 
at the West Point Academy is attain- 
able). 

That the mechanical-experience factor 
shows approximately a zero loading 
(.040) in the pilot criterion for the West 
Point Cadets is further evidence of the 
likelihood of the strong academic inter- 
ests of the group and of their lack of 
experience with shop tools and machin- 
ery. The mechanical experience factor 
contributes somewhat to the validity of 
the respective test batteries for the Negro 
and white cadet populations. 


b. Pilot interest factor. For both the 
West Point Cadets and Negroes, the test 
of Biographical Data—Pilot, CE602D,
makes a substantial and unique contri- 
bution to the validity of the respective 
test batteries. (For further information 
upon this point, see IV, B, 3.) In the two 
factor analyses of the November, 1943, 
and September, 1944, classification bat- 
teries high loadings (.531 and .562) in 
the test of Biographical Data scored with 
pilot weights appeared in a factor which, 
for the West Point Cadet group, received 
a loading no higher than .352 in any 
other test and which, for the Negro 
group, received a loading of .404 in the 
navigator form of the Biographical Data 
test and a loading no higher than .252 
in any other test. This factor, apparently 
unique to the test of Biographical Data,
is tentatively called pilot interest.
Loaded .436 in the pilot criterion for
West Point Cadets, it proves to be the
most valid factor for this group. To
what extent the range of pilot interest
in the criterion variable was restricted
by the presence of only those cadets
who elected pilot training cannot be
determined. Therefore the loading of
.436 may be subject to some degree of
error.

The loading in the criterion for the
Negro group is small with respect to
the pilot interest factor. Apparently the
factors of kinesthesis, perception, and
spatial relations for Negroes are suffi-
cient to account for most of the common-
factor variance in the pilot criterion. It
may be that in terms of the intensity of
motivation the interest factor was pri-
mary for success of the West Point Ca-
dets who ordinarily would elect other
activities. For the Negroes, on the other
hand, the presence of a moderate degree
of interest would result in its subordinate
role in relation to the other factors for
pilot success. The extent to which factors
of temperament may have been signifi-
cant is undoubtedly important, but ob-
jective evidence upon this point is lack-
ing.

The negative loadings in the test, Bio-
graphical Data, CE602D, for the white
cadet group are unexplainable. It may
be that the factor identified as mathemat-
ical background in previous analyses
may have absorbed most of the variance
of the pilot-interest factor. A question
might be raised as to the accuracy of
identification of the factor of pilot inter-
est in previous analyses. Since no entries
in this factor are present in the Com-
posite Factor Analysis Summary for the
four pencil-and-paper tests included in
either the November, 1943, battery or
the September, 1944, battery, it may be
that tests were not present in those ear-
lier matrices which could aid in the iden-
tification of the factor. The small weights
in the factor of pilot-interest for the
four non-biographical tests in Table 12
aid in its identification, inasmuch as
the items of these pencil-and-paper tests
suggest an interest in flying.


TABLE 12

PILOT-INTEREST FACTOR LOADINGS IN DEFINITIVE TESTS AND IN RESPECTIVE PILOT CRITERIA IN THE
THREE POPULATIONS OF WEST POINT, NEGRO, AND WHITE AVIATION-CADETS

Test and Code Number                       Group (I, II, III)

Biographical Data—Pilot        CE602D
Biographical Data—Navigator    CE602D
General Information            CE505E
General Information            CE505F
Rotary Pursuit                 CP410B
Instrument Comprehension       CI616B
Instrument Comprehension       CI616C
Pilot Criterion


4. The Factor of Psychomotor 
Coordination 


Apparently measurable only by ap-
paratus tests, the factor of psychomotor-
coordination has proved to be valid for
pilots, as one might expect. Psychomotor
tests do possess considerable so-called
face validity in that they mimic overt
operations involved in flying. Close
agreement in three populations among
the factor weights for the several appara-
tus tests is apparent in Table 13. Their
factorial composition is complex for all
three groups.

TABLE 13

PSYCHOMOTOR-COORDINATION FACTOR LOADINGS IN DEFINITIVE TESTS AND IN RESPECTIVE PILOT
CRITERIA IN THE THREE POPULATIONS OF WEST POINT, NEGRO, AND WHITE AVIATION-CADETS

Test and Code Number

Complex Coordination           CM701A
Rotary Pursuit                 CP410B
Rudder Control                 CM120B
Two-Hand Coordination          CM101A
Finger Dexterity               CM116A
Discrimination Reaction Time   CP611D
Biographical Data—Pilot        CE602D
Mechanical Principles          CI903A
Mechanical Principles          CI903B
Pilot Criterion

Group (I, II, III)

548 497 53
406 311 48
404 533 34
508 454 34
351 158 12
075 176 22

216 18
094 18


In previous sections concerned with
intellectual factors and with those of
perception and spatial relations, hypoth-
eses have been suggested to account for
the loadings of the apparatus tests in
factors other than psychomotor coor-
dination. Hence, no further consideration
of the factorial pattern will be under-
taken.

The presence of small factor weights
in the paper-and-pencil tests of Bio-
graphical Data—Pilot, CE602D, and
Mechanical Principles, CI903B, may well
be a vicarious functioning of the psycho-
motor factor. Possibly, ideomotor re-
sponses accompany the reactions of some
examinees to the content of certain
items—content which suggests to the
examinee manipulation of mechanical
devices.

In the pilot criterion are noticeable
dissimilarities for the three groups in
the loadings in the psychomotor-coor-
dination factor. Why should this factor be
valid for the West Point group and
white cadet group and not for the
Negroes? That the Negroes do possess
the factor in substantial amounts is indi-
cated by the weights for it in the appara-
tus tests. One possible hypothesis, open
to serious question, is that from previous
training in tasks involving psychomotor
coordination, the stage or level of attain-
ment with respect to the factor is so high
that it is relatively unimportant in pilot-
training in comparison with other fac-
tors, kinesthesis, for example. In fact,
the presence of too high a degree of
psychomotor coordination may tend at
the earlier stages of flying to result in
the execution of improper movements
or, more simply, in the inhibition of
learning. Moreover, a certain amount
of “unlearning” of psychomotor re-
sponses, or perhaps better, relearning,
may be necessary to effect maximum use
of kinesthetic cues.

Part of the difficulty may be in the 
definition of psychomotor coordination. 
Three factors of psychomotor coordina- 
tion appeared in the results of the AAF 
psychological program during the course 
of the War. The first factor is thought to 
involve the somewhat grosser movements
of the trunk and limbs; whereas the
second factor is considered to embrace 
the finer, or more delicate, movements 
of hand and wrist. Hence, it is tenta- 
tively identified as psychomotor preci- 
sion. The third factor has been called 
psychomotor speed. If such a differentia- 
tion is possible, it may be that different 
components of the psychomotor factor 
are apparent for the three groups, or 
that in the instance of the Negro group 
too high a degree of psychomotor speed 
results in unfavorable movements. The 
tendency to employ gross movements of 
trunk and limbs may be a handicap.
Only more extensive research efforts in 
this direction can give the answer to the 
disconcerting picture of the pilot-crite- 
rion. 


5. The Factor of Kinesthesis 


For the Negro group only, a factor to 
which reference has been made several 
times previously appeared with a heavy 
loading (.554) in the Rudder Control 
test and with a large weight (.500) in 
the pilot criterion. For the Negro cadets, 
this factor viewed with respect to load- 
ings of other factors in the criterion is 
obviously the most valid. Toward the 
close of the War, what is evidently the 
same factor emerged in another analysis. 
Inasmuch as the loading in the psychomotor
coordination factor is moderate (.311)
for the Rudder Control test, this factor
can not readily be interpreted as one of 
psychomotor coordination. Presence of 
slight weights in other tests believed to 
contain variance in space suggests that 
the factor is one of space. 

Selection of the name kinesthesis fits 
reasonably well the description of the 
movements required by the examinee in 
his attempt to return the rudder-control 
device to a straight-ahead position. As 


an examinee participates in the rudder- 
control test, the sense of movement (a 
change of position) is conveyed to the 
higher neural processes from sensory 
stimulation received in muscles, ten- 
dons, and joints. In the psychomotor 
tests, Rotary Pursuit and Two-Hand 
Coordination, these same kinesthetic
cues are probably present to a minor 
degree. 

In the pencil-and-paper test Instru-
ment Comprehension II, the presence of
a loading of .238 suggests that various 
positions of planes shown in the multi- 
ple-choice responses have taken over in 
part several aspects of the situation en- 
countered in the Rudder Control test. 
This apparently non-voluntary projec- 
tion of the kinesthetic experience of 
the apparatus test into the pencil-and- 
paper test may be called kinesthetic 
empathy. 

Somewhat disconcerting is the weight
of .225 in the test, Mathematics B. In-
spection of the items of the test does
reveal in the word problems the repre-
sentation of situations in which kinesthet-
ic cues may be involved. Differential veloci-
ties in the word problems and frequent
statements of direction may give rise to
involuntary and largely subliminal neu-
ral excitations which are conditioned to
the more active phases of a kinesthetic
experience. Let it be said, however, that
the preceding statement, like many
others, is merely an hypothesis subject to
modification and possible rejection in
light of any other evidence which may
be forthcoming. Never to be minimized
is the substantial role which sampling
errors can play in the magnitude of small
factor weights.

In Table 14 are the principal tests in
which weights in this new kinesthet-
ic factor appear for Negroes.
TABLE 14


KINESTHESIS-FACTOR LOADINGS IN DE-
FINITIVE TESTS AND IN THE PILOT
CRITERION OF NEGRO CADETS*


Test and Code Number                       Loading

Rudder Control                 CM120B      554
Instrument Comprehension II    CI616B      238
Mathematics B                              225
Rotary Pursuit                 CP410B      186
Two-Hand Coordination          CM101A      173
Mechanical Principles          CI903A
Pilot Criterion                            500


* The kinesthesis factor appeared only in the 
analysis of the intercorrelations of tests in the 
battery administered to Negroes. 


B. AN INTERPRETIVE COMPARISON OF THE
TRADITIONAL MULTIPLE-REGRESSION AND
FACTORIAL APPROACHES TO THE PREDIC-
TION OF A CRITERION


The traditional procedure for the de-
termination of the validity of a test
battery has been that of the multiple-
regression equation in which the maxi-
mum possible validity (multiple correla-
tion) between the criterion variable and
the tests has been achieved through opti-
mal weighting of each test. Although
mathematically sound, this approach in
the hands of many a technician has led
to an unnecessary duplication of cover-
age in several tests in his attempt to
maximize the validity of each test of
the battery. As a rule, factorially com-
plex tests are the result. On the other
hand, factor-analysis techniques do per-
mit an improved control over a test
battery in that, with the factorial struc-
ture of the criterion known, the techni-
cian is able to construct a few relatively
pure tests, each one of which makes a
unique contribution to the validity of
the battery. Although each test individu-
ally may contribute only moderately to
the validity of the battery, the combined
contribution of only a few such rela-
tively pure tests can exceed considerably
that of double the number of tests which
have been selected by the traditional
multiple-regression practices. In con-
junction with factor-analysis procedures
the multiple-regression technique, how-
ever, is a valuable supplement in that it
shows: first, how the test maker should
optimally weight each test for maximum
validity and, second, when this is done,
how much variance each test contributes
to the coefficient of multiple determina-
tion (the multiple correlation, or valid-
ity, coefficient squared). In the following
paragraphs an attempt is made to show
the complementary nature of the tech-
niques of the multiple-regression equa-
tion and factor analysis.

As the factorial composition of a 
criterion becomes better known, the 
maximum possible validity of the test 
battery is enhanced. In other words, the 
sum of the common-factor variance in 
the criterion represents the maximum 
possible amount of variance which these 
factors, optimally weighted in a test com- 
posite, can yield. As the reliability of 
tests and their relative degree of purity 
in the relevant factors are increased, the 
more closely is the maximum potential 
validity approached. Of course, the addi- 
tion of any tests to a battery which may 
aid in the identification of other factors 
in a criterion and in the determination 
of the amounts of variance of these fac- 
tors merely raises the ceiling of the valid- 
ity to a new maximum. 

For the West Point group, the sum
of the common-factor variance (com-
munality) in the pilot criterion is .493
for nine centroid factors extracted and
.470 after rotation for eight real factors
(if the residual factor variance of .014 is
discounted and if the discrepancy of
.009 between the communality after rotation
and that before rotation is overlooked).
The coefficient of multiple determina-
tion R², the amount of variance in the
criterion associated with or predicted
from the most favorable weighting of
the tests in the composite, is .3610, or
about three-fourths of the valid factorial
variance in the criterion. Another way of
stating this comparison is in terms of
the coefficient of multiple correlation, R,
the square root of R². With respect to
the variance in the pilot criterion ac-
counted for by only eight factors, the
maximum derivable R is .692; whereas
the obtained R is .601.
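
The comparison just made can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the two figures quoted above (.470 and .3610); the variable names are illustrative and are not taken from the original study.

```python
import math

# Figures quoted in the text for the West Point group.
communality_rotated = 0.470   # common-factor variance of the criterion, eight rotated factors
r_squared_obtained = 0.3610   # coefficient of multiple determination for the battery

# Share of the factorially identified criterion variance that the battery predicts.
share_predicted = r_squared_obtained / communality_rotated

# Obtained multiple correlation: the square root of R squared.
r_obtained = math.sqrt(r_squared_obtained)

# Ceiling on R implied by a communality of .470 taken by itself.
r_ceiling_from_470 = math.sqrt(communality_rotated)

print(round(share_predicted, 3))
print(round(r_obtained, 3))
print(round(r_ceiling_from_470, 3))
```

The share comes out near three-fourths and the obtained R near .601, as stated. The square root of .470 alone is about .686; the quoted ceiling of .692 corresponds to a communality near .479, which would be .470 plus the .009 discrepancy, although the text does not say explicitly which figure was used.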

What are some of the probable reasons 
which account for the missing fourth of 
the total potential variance permitted 
by the factorial composition of the 
criterion? First of all, to some degree, 
the lack of reliability of the tests them-
selves may limit somewhat the contribu-
tion of tests to R².
difficult either to expand or to defend 
rigorously. Correction for attenuation of 
the original intercorrelations among tests 
would permit the maximum variance as- 
sociated with true scores to be predicted. 
On the other hand, corrections could be 
made for the fallibility of the pilot cri- 
terion scores, which are probably less 
reliable than the test scores. If this were
done, the sum of the common factor 
variance would be considerably aug- 
mented—in fact, it is likely that it would 
increase proportionately more than the 
true variance of the test scores following 
their correction for fallibility. Then, per- 
haps about as much as one-third of the 
potential factorial variance in the infal- 
lible criterion might not be predicted 
from the optimal weighting of tests (the 
scores of which are corrected for atten- 
uation). 


A second, and more fruitful, approach 
seems to be a detailed consideration of 
the factorial composition of each test. 
Among the tests of the battery, an ade-
quate measure may be lacking for a
factor which is highly loaded in the cri- 
terion. Moreover, in several tests, sub- 
stantial positive loadings may be present 
for factors which are negatively loaded 
in the criterion. In other words, a given 
test in some instances may correlate 
negatively with a criterion. If a test 
which correlates negatively with the cri- 
terion receives a positive beta weight, 
its variance contribution to the coeffi-
cient of multiple determination, which
is the product of its beta weight and its 
coefficient of correlation with the crite- 
rion, is negative. Obviously, in this in- 
stance pilot-selection is biased in a di- 
rection antagonistic to the requirements 
of the practical situation. 

Although detailed consideration subse-
quently will be given to the contribu-
tion of individual tests to R², it is il-
luminating at this point to mention that
eight tests of the battery administered
to West Point Cadets actually yield nega-
tive variances. (See Table 15 for listing
of the relative contributions of tests to
the total predicted variance R² for the No-
vember, 1943, and September, 1944, bat-
teries.) In large measure these negative
contributions can be explained in terms
of the presence of high positive loadings
of tests in factors of perception and
number which are weighted —.oqo and
-.053 in the pilot criterion. The sum of
the variance contribution of these eight
tests is -.0490. For purposes of illustra-
tion only, if the factor structure of the
criterion and the contributions of the
remaining thirteen tests to R² could be
assumed constant, the elimination of
these eight tests would mean that R²
would be .410, not too much less than
the ceiling of .470.

Among six tests yielding negligible, 
but positive amounts of variance (less 
than .o1) to the coefficient of multiple- 
determination, a marked degree of com- 
plexity is apparent among five which 
are loaded positively in factors contain- 
ing both positive and negative weights 
in the criterion. Hence, further cultiva- 
tion of these tests through increasing 
their loadings in valid factors would 
allow a further narrowing in the differ-
ence between the coefficient of multiple-
determination, R², and the factorial vari-
ance in the pilot criterion.

For the Negro group, the sum of the
common factor variance for the eight
factors in the pilot criterion after final
rotation is only .381 compared with .470,
identified by the eight factors in the
pilot criterion of the West Point Cadets.
In terms of the proportion of the factor-
ial variance of the criterion which the
coefficient of multiple-determination rep-
resents, the result is somewhat less for
the Negro group than for West Point
Cadets. The value of .1772 for R² is
approximately one-half of .381. Consid-
erably less than that for the West Point
group, the total amount of negative
variance contributed by three tests is
-.0140. In part, the negative contribu-
tions of these tests are explained by the
presence of negative loadings of -.050
and -.060 in the pilot-criterion for the
respective factors of psychomotor coor-
dination and reasoning-number which ap-
pear in the tests. In part, however, these
negative contributions are indicated by
negative beta coefficients which are a
function of not only the correlation be-
tween a test and the pilot criterion, but
also the correlation of a test with all
other tests in the battery.

However, the need for new tests to 
aid in the identification of other factors 
in the pilot-criterion of the Negroes is 
apparent. As suggested previously, im- 
provements of tests in the battery 
through an increase in the degree of 
purity whenever feasible and through 
the addition of tests which describe 
better the relevant factor-variance of the 
pilot-criterion are required. For ex- 
ample, the presence of the factor tenta- 
tively identified as kinesthesis in the 
pilot criterion suggests the importance 
of cultivation of tests loaded in this fac- 
tor. In turn, these new tests may furnish 
additional information concerning the 
factorial composition of the pilot-crite- 
rion for Negroes. Finally, the lack of 
reliability of the pilot pass-fail criterion 
itself for the Negro group is probably 
greater than that for the West Point 
Cadets in view of the lower communal-
ity in the pilot-criterion. In other words,
the difference between the communali- 
ties of the two criteria of the West Point 
group and of the Negroes cannot be 
reasonably attributed just to specific, or 
unidentified, factor variances. 


1. Relative Contributions of Tests in 
the Two Batteries to the Predicted 
Variances 


For each of the tests in the two bat-
teries administered to West Point Cadets
and Negroes, a summary in Table 15
is presented of the beta weights, of the
biserial correlations of each test with
the pilot criterion, and of the relative
contribution of each test to the pre-
dicted variance. Since the coefficient of
multiple-determination R² is the sum of
the products of the beta weight of each
test and of the coefficient of correlation
of the test with the criterion variable:

$$R^2 = \beta_{c1} r_{c1} + \beta_{c2} r_{c2} + \cdots + \beta_{cn} r_{cn},$$

the contribution to the predicted variance of each test is
clearly its beta weight multiplied by its
coefficient of correlation with the cri-
terion:

$$\beta_{ci} r_{ci},$$

where c = the criterion,

where i = the ith test,

where n = the number of tests,

where $r_{ci}$ = the correlation between
the ith test and the criterion,

where $\beta_{ci}$ = the beta (optimal) weight
applied to test i in the composite,

where $R^2$ = the coefficient of multi-
ple-correlation squared (the coef-
ficient of multiple-determination).
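
The decomposition above can be illustrated with a small sketch in Python. It assumes a battery of only two tests, for which the beta weights have a simple closed form, and it uses hypothetical correlations chosen for illustration; none of the numbers or function names comes from Table 15. The second test correlates negatively with the criterion yet receives a positive beta weight, so its contribution to R² is negative, which is the situation described earlier for tests loaded in negatively weighted factors.

```python
def beta_weights_two_tests(r_c1, r_c2, r_12):
    """Standardized regression (beta) weights for two tests predicting a criterion.

    r_c1, r_c2: correlations of tests 1 and 2 with the criterion.
    r_12: correlation between the two tests.
    """
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_c1 - r_c2 * r_12) / denom
    beta_2 = (r_c2 - r_c1 * r_12) / denom
    return beta_1, beta_2


def variance_contributions(betas, validities):
    """Contribution of each test to R squared: beta weight times validity."""
    return [b * r for b, r in zip(betas, validities)]


# Hypothetical correlations (assumptions for illustration only).
r_c1, r_c2, r_12 = 0.40, -0.05, -0.60

betas = beta_weights_two_tests(r_c1, r_c2, r_12)
contributions = variance_contributions(betas, [r_c1, r_c2])
r_squared = sum(contributions)

for i, (b, c) in enumerate(zip(betas, contributions), start=1):
    print(f"test {i}: beta = {b:.4f}, contribution = {c:.4f}")
print(f"R squared = {r_squared:.4f}")
```

With more than two tests the beta weights would be obtained by solving the full set of normal equations, but the sum of beta-times-validity products still equals R².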


Among the tests furnishing the largest
amounts of variance (in excess of .0100)
to the predicted variance of .3610 (R²)
in the battery administered to the West
Point Cadets are the following: Mechani-
cal Principles, CI903B, .0864; Instru-
ment Comprehension, CI616C, .0752;
Rudder Control, CM120B, .0589; Rotary
Pursuit, CP410B, .0572; Biographical
Data—Pilot, CE602D, .0510; Dial and
Table Reading, CP622-21A, .0306; and
Complex Coordination, CM701A, .0258.

Seven tests together contribute a vari-
ance of .3851 which exceeds the pre-
dicted variance of .3610 derived from the
optimal weighting of twenty-one tests in 
the battery. The negative variances in 
eight tests account for this rather sur- 
prising fact. 
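
The arithmetic behind this statement can be verified directly. The sketch below, in Python, sums the seven contributions listed above and subtracts the total from the predicted variance; the dictionary layout and names are illustrative only, and the figure for Instrument Comprehension is taken as .0752, the value given for that test in the later discussion of relatively pure tests.

```python
# Variance contributions of the seven leading tests, West Point battery,
# as listed in the text.
top_seven = {
    "Mechanical Principles": 0.0864,
    "Instrument Comprehension": 0.0752,
    "Rudder Control": 0.0589,
    "Rotary Pursuit": 0.0572,
    "Biographical Data-Pilot": 0.0510,
    "Dial and Table Reading": 0.0306,
    "Complex Coordination": 0.0258,
}

r_squared = 0.3610  # predicted variance from all twenty-one tests

sum_top_seven = round(sum(top_seven.values()), 4)

# Net contribution left for the other fourteen tests of the battery.
remainder = round(r_squared - sum_top_seven, 4)

print(sum_top_seven)
print(remainder)
```

The remainder is negative, which is possible only if the negative contributions among the other fourteen tests outweigh their positive ones.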

With the exception of the tests of 
Rotary Pursuit and Instrument Compre- 
hension, inspection of the factor load- 
ings of the other five tests (see Table 5) 
shows them to be complex. Inasmuch as
most loadings in these tests are in factors 
with high weights in the pilot criterion, 
the high degree of variance-contribution 
is not surprising. However, the fact that 
two of the most nearly pure tests con- 
tributed .0752 and .0572 to the total pre- 
dicted variance indicates that four or 
five such tests (one for each factor with 
a substantial loading in the criterion), 
could jointly yield as much variance as 
several complex tests which overlap in 
their function. 

For the Negro population the follow- 
ing tests supply the largest amounts of 
variance to the total predicted variance: 
Instrument Comprehension II, CI616B,
.0890; Rudder Control, CM120B, .0277; 
General Information, CE505E, .0132; 
Biographical Data—Pilot, CE602D, .0115; 
and Dial and Table Reading, 
CP622-21A, .0113. 

The sum of the variances for these five 
tests is .1527, an amount slightly less
than that of the total predicted variance 
of .1772. The test in instrument compre- 
hension actually contributes more than 
half the variance to the coefficient of 
multiple determination. Although this 
test is relatively pure for the West Point 
Cadets, it is, for the Negroes, extremely
complex, with loadings of .374, .372, 
.369, .2g0, and .238 in the factors of 
perceptual speed, mechanical experience, 
spatial relations, verbality, and kines- 
thesis, which in the criterion are 
weighted respectively .260, .130, .190, 
.0g8, and .500. Similarly, it can be shown 
that the other four tests are relatively 
complex. 


2. Factorial Estimates of the Validity 
of Tests 


The validity of each test in a given 
factor is indicated by the correlation of 
the test with that factor; in other words,
by the loading of the test in the factor. 
With respect to the criterion, instead of 
the factor, the validity of a test has been 
defined previously as the sum of the cross 
products of the paired factor loadings 
in the test and the criterion. 
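
As a minimal sketch of this definition, assuming orthogonal (uncorrelated) factors, the factorial estimate of validity is the sum of the products of each test loading with the criterion loading in the same factor. The function name and the loadings below are hypothetical and are not taken from the tables of this study.

```python
def factorial_validity(test_loadings, criterion_loadings):
    """Sum of cross products of paired factor loadings (test x criterion)."""
    if len(test_loadings) != len(criterion_loadings):
        raise ValueError("loadings must be paired factor by factor")
    return sum(t * c for t, c in zip(test_loadings, criterion_loadings))


# Hypothetical loadings in three orthogonal factors (illustrative only).
test = [0.60, 0.20, 0.10]
criterion = [0.40, 0.30, -0.10]

estimate = factorial_validity(test, criterion)
print(round(estimate, 3))
```

A negative criterion loading, as in the third factor here, lowers the estimated validity of any test positively loaded in that factor.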

Discrepancies between the biserial 
correlation coefficients and the factorial 
estimates of validity are negligible. For 
the West Point group the largest single 
discrepancy between a reproduced coef- 
ficient of correlation and the original is 
+.030. With the exception of two 
discrepancies of —.0240 and —.og01, all
others are less than |.0200|. In the Negro 
group, somewhat larger discrepancies 
occur in the reproduction of the biserial 
correlation coefficients, although the 
largest is —.04g0. Three other repro- 
duced coefficients deviate from the 
original biserial coefficient more than 
|.040|. Since the cross-products of the 
paired loadings in eight factors were 
used instead of nine (as was done for 
the West Point group), the somewhat 
larger discrepancies of the Negro group 
seem reasonable. 

Comparisons may be made of the 
validities of different tests in the same 
battery as well as of the same test in 
different batteries (although the latter 
comparisons should be made with
greater caution). For example, the con- 
sistently lower validity coefficients for 
the psychomotor tests in the Negro 
group are in essential agreement with 
the presence of a slight negative loading 
(—.050) in the psychomotor-coordina- 
tion factor in the Negro pilot-criterion. 
In contrast, the higher validity coef- 
ficients of the apparatus tests in the West 
Point group correspond to the substantial
positive loading of .305 in the
psychomotor factor in the pilot-criterion. 


3. Tests containing Substantial Amounts 
of Unique Variances 


In a battery a test loaded with a large 
portion of variance not found in any 
other test is said to contain a substantial 
amount of unique variance.* Such a test
need not be pure, although the proba- 
bilities are that it is relatively pure if the 
amount of unique variance is great. 
Arbitrarily two conditions may be stated 
to define operationally a test which con- 
tains a substantial amount of unique 
variance: 

(1) that the loading (variance) of the 
factor in the test be equal to, or greater 
than, .500 (.250). 

(2) that the loading (variance) of no 
other test in the battery with respect to 
the factor be equal to or greater than 
.300 (.090).
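
The two conditions can be expressed as a simple screening rule. The following is a sketch only: the function name, the test names, and the loadings are hypothetical, and the use of absolute values (so that the rule agrees with the squared-loading thresholds of .250 and .090) is an assumption.

```python
def unique_variance_tests(loadings, high=0.500, low=0.300):
    """Return (test, factor index) pairs meeting both operational conditions.

    loadings maps a test name to its list of factor loadings, with the
    factors in the same order for every test.
    """
    hits = []
    for name, row in loadings.items():
        for k, value in enumerate(row):
            # Condition (1): the test itself loads .500 or more in the factor.
            if abs(value) < high:
                continue
            # Condition (2): no other test loads .300 or more in that factor.
            others = [r[k] for other, r in loadings.items() if other != name]
            if all(abs(v) < low for v in others):
                hits.append((name, k))
    return hits


# Hypothetical battery of three tests and two factors (illustrative only).
battery = {
    "Test A": [0.62, 0.10],
    "Test B": [0.25, 0.55],
    "Test C": [0.05, 0.41],
}
result = unique_variance_tests(battery)
print(result)
```

In this illustration only Test A qualifies: Test B meets the first condition in the second factor, but Test C also loads above .300 in that factor.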

The fulfillment of such a set of stringent 
conditions tends to make the naming 
of a factor difficult, if not impossible. 

However, as was pointed out in (IV, 
A, 3b and 5), three such circumstances 
arose in the identification of factors 
(pilot interest and kinesthesis). For both 
the Negro group and the West Point 
group the presence of a large amount of 
variance in the test Biographical Data- 
Pilot served primarily to identify the 
factor of pilot interest. Similarly, for 
the factor identified as kinesthesis, only 
one test (Rudder Control) contained a 
substantial amount of variance in this 
ability. The fact that, in the circumstances
mentioned, two substantial loadings exist
(one in a single test and the other in the
criterion) indicates that a marked
contribution to the validity of the battery
is associated with a factor, be it common
or specific, named or unnamed. Whether a
construct such as a factor can be named
or not is of little consequence if its
presence may be inferred from the
operations of the testing situation.


* By this definition any test relatively pure
with respect to a factor contains a substantial
amount of unique variance provided that no
other test in the battery contains a large amount
of variance in the factor. If two or more tests
(either complex or pure in their factorial
composition) yield a large portion of variance in
the same factor, a unique contribution is no
longer said to be made, since neither test is the
only one supplying variance in the factor.

CHAPTER V
SUMMARY AND CONCLUSIONS 


A. SUMMARY 


THE PURPOSE of the investigation was
to ascertain the contributions of
factors both to the description of tests 
and to the predictive values of these tests 
in two pilot populations of the United 
States Army Air Forces. To two samples, 
consisting of 815 West Point Cadets and 
of 356 Negro cadets, were administered 
respectively the November, 1943, Classi- 
fication Battery, consisting of twelve 
pencil-and-paper tests and of six ap- 
paratus tests and the September, 1944, 
Classification Battery consisting of fifteen 
pencil-and-paper tests (of which seven 
were identical with those of the first 
battery) and of the same six psychomotor 
tests. That the two samples were non- 
homogeneous was demonstrated by the 
application of the Fisher t-test to the 
differences between mean composite 
scores (stanine standings) of the two 
groups, and to the differences between 
mean scores of tests identical to the 
two batteries. All differences between 
sample means were significant beyond 
the one per cent level. 

In Chapter I, the problem of the in- 
vestigation was clarified in terms of two 
major questions and in terms of sub- 
questions pertaining to them. In the first 
question the following points were pro- 
posed: (1) the identification and interpre- 
tation, for the two groups, of factors 
derived from two matrices of intercorre- 
lations; (2) a comparison of the weights 
of the identified factors in the two 
groups with the loadings in the same 
factors for a third representative white 
aviation-cadet population; and (3) a 
comparison of the factorial composition 
of the pass-fail criteria (each of which
was included in respective matrices of 
intercorrelations) of the three popula- 
tions. In the second question the differ- 
ences between the two groups in the 
prediction of the pass-fail criterion from 
scores in the tests optimally weighted were
proposed for consideration, along with 
the relationship of the traditional multi- 
ple-regression techniques to approaches 
employed in factor analysis. 

In addition to the definition of the 
two populations in operational terms, 
both a brief rationale for the inclusion 
of tests in the two batteries and a detailed
description of the tests with respect
to their purpose, content, scoring-formulae,
and time limits were presented
in Chapter II. Such background material 
was deemed important to a meaningful 
interpretation of factors. 

In Chapter III followed a survey of 
the statistical procedures, which closely 
paralleled in sequence the two major 
questions proposed in the statement of 
the problem in Chapter I. Two factor 
analyses following the Thurstone system 
were made of the two matrices of inter- 
correlations consisting of nineteen and 
twenty-two variables respectively for the 
two groups of Negroes and West Point 
Cadets. For these two matrices, eight 
centroid and nine centroid factors, re- 
spectively, were extracted. By the Zim- 
merman method, a psychologically mean- 
ingful rotation of eight factors was 
effected for each matrix. 

The Doolittle method was employed 
in the prediction of the pass-fail
criterion from the tests of each battery. The
relative contribution of each optimally 
weighted test to the total predicted
variance of the pilot criterion (coefficient 
of multiple determination) was com- 
puted by the multiplication of the beta 
weight of the test and its coefficient of 
correlation with the criterion. Compari- 
sons were then made of the variance- 
contributions of each test in a given 
battery to the total predicted variance. 
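
A two-test illustration may make the computation concrete. The sketch below uses the standard closed-form beta weights for two standardized predictors rather than the Doolittle solution employed in the study, and the correlations are hypothetical. It shows that the products of beta weight and validity sum to the coefficient of multiple determination, and that a single contribution can be negative.

```python
def two_test_contributions(r1c, r2c, r12):
    """Beta-times-validity contributions for two standardized predictors.

    r1c, r2c: correlations of tests 1 and 2 with the criterion.
    r12: correlation between the two tests.
    """
    denom = 1.0 - r12 ** 2
    beta1 = (r1c - r2c * r12) / denom
    beta2 = (r2c - r1c * r12) / denom
    return beta1 * r1c, beta2 * r2c


# Hypothetical correlations (illustrative only, not from the study).
c1, c2 = two_test_contributions(r1c=0.50, r2c=0.10, r12=0.60)
r_squared = c1 + c2

print(round(c1, 5), round(c2, 5), round(r_squared, 4))
```

Here the second test receives a negative beta weight although its correlation with the criterion is positive, so its contribution is negative and the first test alone contributes more than the total predicted variance.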

In Chapter IV, the sequence of the 
two major divisions concerned with the 
interpretation of the statistical results 
was the same as that set up in Chapter 
III. For the West Point Cadets the eight
rotated factors were identified as me- 
chanical experience, number, pilot inter- 
est, psychomotor coordination, percep- 
tual speed, reasoning, spatial relations, 
and verbality. For the Negro group, 
variance in seven of these eight factors 
appeared. A general intellectual factor 
made up of both numerical and reasoning
components was evident. An eighth
factor, tentatively identified as
kinesthesis, emerged.

Detailed comparisons were made of 
the factor-weights in the tests adminis- 
tered to the two groups. Moreover, load- 
ings in eight factors for a white aviation- 
cadet population, available from previ- 
ous analyses, were compared with those 
in tests administered to either or both 
of the two groups of West Point Cadets 
and Negroes. 

Unfortunately, any difference between 
factor loadings in the same test or in 
different tests administered to two or 
more groups cannot be tested for sta- 
tistical significance. The impossibility 
of determining the standard error of a 
factor loading is probably the major 
disadvantage of the Thurstone system. 
Nevertheless, hypotheses were set up to 
rationalize what appeared to be notice- 
able differences among the three groups 
in loadings of any test with respect to
a factor under consideration. 

Similarly, attempts were made to ra- 
tionalize the differences in the factorial 
composition of the pilot-criteria of the 
three groups—differences much more 
pronounced than those appearing in the 
tests. For the West Point Cadets the 
three most valid factors were pilot inter- 
est, spatial relations, and psychomotor 
coordination, with loadings of .436, .415, 
and .305, respectively, in the pilot cri- 
terion. For the Negro cadets the three 
most valid factors were kinesthesis, per- 
ceptual speed, and spatial relations with 
loadings of .500, .260, and .190, respec- 
tively, in the pilot criterion. Previous 
analyses revealed that for representative 
white aviation-cadets (mostly pilots) the 
most valid factors were spatial relations, 
mechanical experience, and psychomotor 
coordination. 

Loadings in intellectual factors such 
as number, reasoning, and verbality were 
low in the pilot criterion for all three 
groups; in fact, the number factor ac- 
tually contributed negatively to the pre- 
diction of pilot success for the three 
groups. Negative contributions of vari- 
ance to the total predicted variance in 
tests emphasizing numerical operations 
substantiated the importance of these 
negative weights in the criterion. 

Two rather surprising results in the 
factor analyses were: first, the relatively 
high loadings for Negroes in the verbal 
factor in pencil-and-paper tests not in- 
tended to measure the verbal factor; 
second, the negative loading of the 
psychomotor factor in the pilot criterion 
for Negroes. In the former instance, the 
hypothesis was proposed that the level 
of reading difficulty of the lengthy di- 
rections of several pencil-and-paper tests 
more nearly approximated the true level 
of the Negroes than did the test in
reading comprehension. For the second
result, an hypothesis was suggested of
inhibition in learning, or negative trans- 
fer effect. Carried over from previous 
experiences in which a high degree of 
psychomotor coordination had been de- 
veloped, certain patterns of motor re- 
sponse might have blocked the successful 
execution of new motor reactions re- 
quired in the early stages of pilot train- 
ing. 

Relating the traditional multiple- 
regression techniques to the newer 
approaches of factor analysis proved 
illuminating. Stress was placed upon the 
complementary contributions of the two 
procedures. For the two groups of West 
Point Cadets and Negroes, the respective 
amounts of total predicted variance (or 
the coefficients of multiple determina- 
tion) were approximately three-fourths 
and one-half of the communality (sum 
of the common-factor variance) of the 
pilot criteria. For the two groups the
coefficients of multiple determination 
were .3610 and .1772 respectively, com- 
pared with the respective communalities 
of .484 and .381 in the pilot criteria 
after rotation. In both groups, several 
tests contributed negative amounts of 
variance to the total predicted variance. 
To a considerable extent the presence 
of negative loadings of factors in the 
pilot criterion with which these tests 
were loaded accounted for the negative 
contributions of variance. Although 
psychomotor tests were highly valid for 
West Point Cadets, the slight negative 
weight of the psychomotor factor in the 
Negro criterion accounted for the small 
positive portions of variance contributed 
by apparatus tests to the coefficient of 
multiple determination. In order that
validity might be maximized, two
courses of action were suggested: the
introduction of new tests to aid in the
identification and description of other
potential factors in the pilot criterion,
and the purification of tests with respect
to factors highly weighted in the
criterion.

The application of traditional multi- 
ple-regression techniques to the study of 
validity indicated that pencil-and-paper 
tests individually furnished as much 
variance to the total predicted variance 
as the psychomotor tests, or perhaps even 
more variance than these apparatus tests. 
Particularly important was the finding 
that, for the West Point group, the
spatial relations factor could be better 
described by a form of pencil-and-paper 
test than by any one of the psychomotor 
tests. 

In general, the more factorially com- 
plex tests furnished the largest amounts 
of variance to the total predicted vari- 
ance. A form of a test in instrument 
comprehension, which was highly com- 
plex factorially for the Negro group, 
yielded more than half the variance to 
the coefficient of multiple determination. 
However, for the West Point Cadets,
two relatively pure tests (a different
form of the test in instrument
comprehension, and an adaptation of the
familiar rotary pursuit task) made the
second and fourth greatest contributions
of variance, .0752 and .0572 respectively.

For the West Point Cadets, one test 
contained a substantial amount of
unique variance (variance not common 
to any other test) in the factor of pilot 
interest. The identification of the factor 
as pilot interest is somewhat tentative, 
inasmuch as a single test, a biographical 
data blank, largely defined it. However, 
the presence of a high weight in the 
criterion for the same factor in which 
the test was saturated is indicative of a
high degree of validity for the biographi- 
cal data blank. As expected, the multi- 
ple-regression approach showed that its 
contribution to the total predicted 
variance was .0510. For the Negroes the 
same factor appeared in this test, but 
the small loading of the factor in the 
pilot criterion limited its predictive 
value (.0115 to the total predicted vari- 
ance). A second test, involving a rudder 
control task, contributed unique vari- 
ance to the validity of the battery ad- 
ministered to Negroes. The high 
weights in the factor, identified as 
kinesthesis, in both the test and the 
criterion, accounted for its variance con- 
tribution (.0277) to the coefficient of 
multiple determination. 


B. CONCLUSIONS


Certain limitations in the interpreta- 
tion of the results have been indicated 
previously. The lack of complete 
identity of the tests in the two batteries 
naturally restricted, to a degree, the 
comparisons which might be made in 
factor loadings of tests which were 
identical in the two batteries. The lack 
of a statistical test for the standard error 
of a factor loading prevented calculation 
of the probability of a reliable differ- 
ence between two or more factor weights. 
In the multiple-regression procedures, 
tests for linearity were not made, al- 
though superficial checks indicated that 
the assumptions of homoscedasticity and 
of rectilinearity were fulfilled to a satis- 
factory degree. 

Sweeping generalizations of the results 
obtained for the two samples to the 
description of characteristics of racial or 
socio-economic groups are definitely not
warranted. Nor can any conclusions be
made, from the results obtained, concerning
the indication of possible genetic
differences in the two samples. 

The following five conclusions have 
been formulated from the results of the 
investigation: 

(1) In general, the factor loadings of 
tests of the two batteries administered 
to the two groups of West Point Cadets 
and Negroes and of the same tests given 
to a representative white aviation-cadet 
population were comparable, although 
some differences did appear, notably 
with respect to factors weighted less than 
.400 in a test for any one of the three 
populations. 

(2) The factorial composition of the 
pilot criteria was markedly different for 
the three populations. Part of the ob- 
served differences may be due to the 
unreliability of the criteria. Some simi- 
larities were apparent: 

(a) The factor of spatial relations was 
valid for all three populations in the 
prediction of pilot success. 

(b) The intellectual factors of reason- 
ing, number, and verbality were not 
valid for the prediction of pilot success. 

(3) A new factor identified as kines- 
thesis appeared for the Negro popula- 
tion only. This factor was the most valid 
one for prediction of pilot success for 
Negroes. 

(4) The unique contribution of a bio- 
graphical data blank to the validity of a 
test battery was demonstrated for two 
populations of West Point Cadets and 
Negroes. A large amount of the variance 
of this test was identified to be that of 
pilot interest. 

(5) The presence of a factor identified 
as spatial relations, in a pencil-and-paper 
test as well as in apparatus tests, has 
suggested the potential economy of 
pencil-and-paper tests in the measurement
of human abilities frequently
effected by more cumbersome devices.


C. SUGGESTIONS FOR FUTURE RESEARCH 


As mentioned several times previously, the 
development of tests relatively pure with re- 
spect to factors present in the pilot criterion is 
necessary for raising the validity of a test compo- 
site. Now that a substantial portion of the area 
of the pilot criterion is defined, individual tests 
which describe one factor at a time might well 
be administered on an experimental basis to 
pilot cadets in the peacetime Army. Factor 
analyses of the results would be likely to indi- 
cate that further revisions in tests would be 
desirable. Moreover, several tests need to be 
introduced on a trial-and-error basis, to test 
hypotheses concerning the roles of other factors 
in the pilot criterion. 

For reasons of economy, particular attention 
should be directed toward development of 
pencil-and-paper tests which can duplicate the 
functions of the more cumbersome apparatus 
tests. For the space factor, marked progress in 
this direction already has been made. 

Separate batteries for selection of personnel 
in different air-crew positions seem to be neces- 
sary if differential prediction is to be maximized. 
Tests measuring intellectual factors undoubtedly 
need to be included in batteries administered 
to prospective navigators. However, the presence 
of such tests in a composite administered to 
certain groups of pilot candidates may detract 
from the validity of the battery, if there has 
been initial selection on a general qualifying 
examination (which measures necessary mini- 
mum intellectual requirements for pilot train- 
ing). 

Of course, persistent attempts at a better de- 
scription of the factorial structure of the cri- 
teria corresponding to various air-crew positions 
are paramount in importance. As the structure 
of the criterion in question becomes better 
known, the introduction of new tests and the 
revision of others will be necessary. 


Not to be minimized, however, is the need
for improved criteria. It is quite probable that
the gross pass-fail criterion could advantageously
be replaced by many independent and
relatively pure criteria. The use of several
independent criteria, each one of which can be
differentiated into a number of levels of
performance (instead of merely pass-or-fail), would
appear to be a promising approach.

The application of these techniques in the
field of vocational guidance affords an unlimited
area of research. A comprehensive battery of
relatively pure tests measuring the more
important abilities in different occupations would
permit a profile of one’s scores in different
factors to be made. If cut-off points could be
established for factor scores in different
occupations, a counselor after studying an examinee’s
profile of scores would be in a better position
to give vocational advice. Applications in the
field of educational guidance are immediately
apparent.

Finally, the extension of factor-analysis
procedures to the fields of personality and
temperament would permit the clinical psychologist
the use of the profiles suggested. Moreover, the
inclusion in the same battery of pure tests de- 
signed to measure both aptitudes and personality 
factors would serve to narrow the gap between 
these two areas of testing and to allow joint use 
of these factor scores in vocational guidance, 
provided of course that the pattern of the 
temperament and aptitude factors in a given 
occupation had been ascertained. 

In short, the potentialities of the factor- 
analysis procedures barely have been realized. 
It is only through continuous research that the 
horizon of man’s knowledge of the primary 
abilities and of the components of temperament 
may be broadened. In combination with other 
procedures of demonstrated merit, the tech- 
niques of factor analysis afford at the present 
time a most encouraging means for the solution 
of fundamental problems in psychological and 
educational testing and in experimental
psychology.